SRE Sr Leader - REMOTE

Simple Solutions | Remote | May 23rd, 2026
Overview

We are seeking an SRE Senior Leader to drive system uptime, performance, and scalability by blending software engineering with operational expertise. The role leads teams that define SLIs/SLOs, automate infrastructure (IaC), manage incidents, and conduct post-mortems. Key responsibilities include mentoring engineers, setting reliability strategy, and optimizing cloud costs.

Core Responsibilities

Leadership & Mentoring: Lead a team of SREs, manage sprint planning, and foster career growth.
System Reliability & Strategy: Own the uptime, performance, and capacity planning of production systems.
Automation & Tools: Reduce manual work (toil) by building automation, managing infrastructure as code (Terraform, Kubernetes), and enhancing observability.
Incident Management: Drive root cause analysis (RCA), lead incident responses, and implement post-mortem action items.
SLI/SLO Management: Define Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to balance velocity and reliability.

Required Skills and Qualifications

Technical Expertise: Proficiency in coding/scripting (e.g., Python, Go) and familiarity with CI/CD tools.
Infrastructure Skills: Strong knowledge of cloud platforms (AWS, GCP, Azure), Linux, networking, and containerization (Kubernetes).
Leadership Experience: Proven experience leading technical teams and managing complex projects.
Communication: Ability to communicate technical SRE initiatives to stakeholders across the organization.

Preferred Experience

5+ years in SRE, DevOps, or Software Engineering.
Experience managing 24/7 high-availability production environments.