Program Architect
Job Title: SRE Program Architect

The SRE Program Architect plays a critical role in the observability and reliability engineering program. This role ensures enterprise-grade execution of observability initiatives, with responsibilities covering governance, design, implementation, and optimization of relevant tools and practices. The SRE Program Architect operates across hybrid/multi-cloud environments, integrates with ITSM, and ensures alignment to SLO-driven service outcomes. This position requires both strong technical experience and enterprise delivery expertise.

Responsibilities
- Define and execute standards, frameworks, and playbooks aligned to observability objectives.
- Collaborate with cross-functional teams (DevOps, SRE, application owners, infra teams) to ensure adoption.
- Ensure data, metrics, logs, traces, and events converge into actionable insights.
- Integrate tooling (Dynatrace, LogicMonitor, ELK, ServiceNow) into CI/CD and operational workflows.
- Build and maintain dashboards, KPIs, and reporting packs to support stakeholders at all levels.
- Support regulatory compliance, risk management, and audit readiness through observability practices.
- Mentor team members and contribute to knowledge sharing and process maturity.

Required Skills
- 8–12 years of relevant experience in enterprise-scale IT, monitoring, or observability programs.
- Proven expertise in key observability platforms (Dynatrace, LogicMonitor, ELK, ServiceNow).
- Strong experience in hybrid/multi-cloud environments (AWS, Azure, GCP, VMware).
- Hands-on automation/IaC (Terraform, Ansible, GitOps, YAML).
- Excellent understanding of ITIL and SRE practices (SLIs, SLOs, error budgets).
- Ability to work with globally distributed teams and manage stakeholder expectations.
- Strong problem-solving, communication, and leadership skills.

Preferred Skills
- Exposure to OpenTelemetry, Prometheus, Grafana, and modern observability stacks.
- Familiarity with Dynatrace Grail/DQL and AI-based anomaly detection.
- Knowledge of cost optimization and FinOps practices in observability platforms.
- Industry certifications in observability, SRE, ITIL, or cloud (AWS/Azure/GCP).
- Experience in regulated industries (finance, healthcare, public sector).

Tool Priorities
- Monitoring/APM: Dynatrace (APM, RUM, Synthetics, Monaco/YAML), LogicMonitor.
- Logging: ELK stack (Elasticsearch, Logstash, Kibana, Beats).
- Automation/IaC: Terraform, Ansible, GitOps pipelines, YAML configs.
- ITSM: ServiceNow (Event Management, CMDB, Incident/Problem flows).
- Analytics/Reporting: Power BI, Grafana, QBR dashboards.

What Sets You Apart
- Clarity from Complexity – Turns noise into structured, high-value outcomes.
- First-Principles Thinking – Challenges assumptions, connects patterns, crafts durable solutions.
- High-Pressure Excellence – Delivers with clarity and resilience under tight deadlines.
- Critical Path Ownership – Unblocks dependencies, revives at-risk initiatives, drives momentum in complex, multi-stakeholder environments.
- Elite Communication – Distills complexity into compelling narratives at multiple altitudes (executives, peers, customers), influencing outcomes.
- Breadth + Depth – Orchestrates across tracks while diving deep when it matters.
- Adaptive Learning – Evolves rapidly and stays future-ready.
- Ethical Judgment – Anticipates downstream consequences of decisions, champions fairness and responsibility, and safeguards trust.