DevOps Engineer – SRE
Senior DevOps Engineer – SRE

Location: California
Duration: 12+ months
Only for local candidates (California)

Overview

We are seeking a highly skilled Senior DevOps Engineer – Site Reliability Engineering (SRE) to lead the design, implementation, and reliability of scalable cloud infrastructure. This role focuses on ensuring high availability, performance optimization, and automation across AWS environments.

The ideal candidate will bring deep expertise in AWS, monitoring, and automation, with a strong SRE mindset to support mission-critical applications in a 24/7 production environment. You will work closely with engineering and operations teams to build resilient systems, improve observability, and drive operational excellence.

Required Skills

- Strong hands-on experience with AWS cloud services and infrastructure management
- Experience implementing alerts, alarms, and notifications using CloudWatch and/or Dynatrace
- Experience working with Kafka and AWS services such as ECS and EKS
- Expertise in Infrastructure as Code (IaC) using Terraform or AWS CDK
- Strong background in automation and configuration management
- Experience with CI/CD pipelines (Jenkins, Azure DevOps, or similar tools)
- Proven Site Reliability Engineering (SRE) experience in production environments
- Strong Linux system administration and OS-level troubleshooting skills
- Experience supporting 24/7 production environments, including incident response and root cause analysis (RCA)
- Solid understanding of monitoring, observability, and performance tuning
- Experience with networking fundamentals (TCP/IP, DNS, load balancing)

Preferred Skills

- AWS certifications (DevOps Engineer or Solutions Architect)
- Experience with Ansible, Python scripting, or other automation tools
- Familiarity with high availability (HA) and disaster recovery (DR) architectures
- Experience with container orchestration and microservices architecture
- Knowledge of security best practices and vulnerability management tools
- Experience working in enterprise-scale environments
- Exposure to Java/.NET application deployments
- Understanding of databases (SQL Server, Oracle)
- Strong troubleshooting and problem-solving skills across infrastructure and applications
- Experience with multi-region / multi-AZ AWS deployments