JOBSEARCHER

Lead Site Reliability Engineer

A company is looking for a Lead Site Reliability Engineer to own reliability outcomes for a modern SaaS platform.

Key Responsibilities

- Define and drive reliability strategy across control-plane and data-plane systems
- Establish and operationalize SLOs, SLAs, and error budgets
- Lead incident management and drive systemic fixes for long-term reliability improvements

Required Qualifications

- 6+ years leading delivery of complex, distributed systems or SaaS platforms
- Strong experience with multi-region, split-plane architectures
- Proven track record improving SLOs, MTTR, and system reliability at scale
- Proficiency in programming languages such as Python, Java, C++, or JavaScript
- Deep experience with Kubernetes, CI/CD, and Infrastructure as Code