SRE Architect
Job Title: SRE Architect
Location: Atlanta, GA (hybrid)

Key Responsibilities
• Perform SRE operations for distributed systems, ensuring high availability, reliability, and operational excellence.
• AI in SRE.
• Partner with application/domain teams to strengthen their SRE maturity and operational readiness.
• Write automation, scripts, and REST APIs to integrate with external systems and eliminate repetitive tasks.
• Onboard services to Dynatrace/observability platforms; define dashboards, alerts, SLIs, and SLOs.
• Architect and implement resiliency patterns, including failover strategies, circuit breakers, and graceful degradation.
• Drive cost optimization (FinOps) initiatives across cloud workloads.
• Support AWS (or other cloud platform) operations and engineering needs.
• Work with ROSA/container platforms for deployment, scaling, and reliability.
• Recommend improvements in technology, architecture, and domain-specific reliability areas.
• Manage and support large-scale systems.
• Reduce toil by identifying repetitive tasks and automating them.
• Contribute code, read and interpret service repositories, and assist teams with engineering tasks as needed.
Required Skills & Experience
• Strong background in SRE operations for distributed systems.
• Proficiency in development/coding (Python, Go, shell scripting, or similar).
• Ability to read/interpret codebases and build REST APIs.
• Experience with Dynatrace/observability onboarding and ecosystem.
• Deep knowledge of resiliency engineering and failover strategies.
• Strong understanding of FinOps principles and cloud cost optimization.
• Hands-on experience with AWS or any other cloud provider.
• Experience with ROSA or Kubernetes-based container platforms.
• Proven automation skills to eliminate operational toil.
• Experience managing large-scale systems in production.
• Capability to suggest architecture and domain improvements.
• Strong analytical, troubleshooting, and collaboration skills.

Thanks and Regards,
Syed Moiz
Email: Syed@exatechinc.com
Phone: 512 399 9034