Cloud Engineer – Observability and SRE (Santa Clara)
Akkodis is hiring a Cloud Engineer – Observability and SRE for a 7-month contract role in the San Francisco Bay Area in a hybrid work setting. We are seeking an experienced SRE/Cloud Engineer with 8+ years of hands-on expertise in large-scale Kubernetes and AWS production environments, CI/CD, Terraform automation, and observability tools such as Prometheus, Grafana, ELK/OpenSearch, or Splunk, along with strong production support, incident response, troubleshooting, and distributed systems reliability experience.

Pay Range: $55.00/hour - $62.00/hour on W2 without benefits (the rate may be negotiable based on experience, education, geographic location, and other factors).

Hybrid: 3 days onsite/week

Position Summary
The Grade 10 Cloud Engineer within the Customer's Cloud Collaboration Technology Group will play a key role in building and operating scalable observability and infrastructure platforms supporting Webex microservices. This role requires strong hands-on expertise in Kubernetes, cloud infrastructure, and observability systems, along with the ability to operate independently and to own components end-to-end in production environments.
Candidates will demonstrate extensive use of generative AI tools for code generation and production system troubleshooting.

Key Responsibilities
· Design, develop, and operate observability platforms – to perform logging, metrics, and/or tracing – for Webex microservices.
· Manage and optimize Kubernetes clusters across multi-region environments.
· Own CI/CD pipelines using Argo CD and Helm.
· Implement Infrastructure as Code (IaC) using Terraform on AWS.
· Operate monitoring ecosystems, including but not limited to: OpenSearch/ELK, Prometheus, Grafana, Splunk, and Kafka.
· Build automation to detect and remediate production issues.
· Ensure security compliance through vulnerability patching.
· Collaborate cross-functionally to improve reliability.
· Participate in on-call rotations and incident response.
· Contribute to distributed system design and operations.

Required Education:
· Bachelor's degree in computer science or related field.

General Technical Skills:
· At least 8 years of experience in a DevOps and/or SRE platform engineering role
· Incident response and on-call operations: Demonstrated experience in a 24/7 production environment, including but not limited to:
  · Triaging alerts
  · Leading incident response
  · Writing post-incident reviews
  · Maintaining SLA commitments across large-scale distributed systems
· IaC and automation: Proficiency with Terraform, Ansible, and/or equivalent IaC tooling for provisioning and managing cloud infrastructure at scale on AWS
· Scripting and development: Working proficiency in Python, Golang, and/or Bash for building automation scripts, operational tooling, and/or CI/CD pipeline integrations (e.g., Drone, GitHub Actions, Argo CD).

Specific Technical Skills:
· Kubernetes and container orchestration: Production experience operating and troubleshooting workloads on Kubernetes at large scale (i.e., hundreds of deployments and thousands of pods), including but not limited to:
  · Helm chart management
  · Pod scheduling
  · Resource tuning
  · Multi-cluster operations
· Observability stack expertise: Hands-on experience – performing pipeline design, query optimization, and/or capacity planning for high-volume environments – in at least two (2) of the following:
  · OpenSearch/Elasticsearch
  · Prometheus/Mimir
  · Grafana
  · Loki
  · Splunk
  · Logstash

Desired Skills
· Apache Kafka/AWS MSK: Experience in at least one (1) of the following:
  · Operating or tuning Kafka clusters at scale
  · Managing the following across high-throughput streaming pipelines:
    · Topic configurations,
    · ACLs,
    · Consumer lag, and/or
    · Schema registries

Equal Opportunity Employer/Veterans/Disabled

Benefit offerings available for our associates include medical, dental, vision, life insurance, short-term disability, additional voluntary benefits, an EAP program, commuter benefits, and a 401K plan. Our benefit offerings provide employees the flexibility to choose the type of coverage that meets their individual needs. In addition, our associates may be eligible for paid leave, including Paid Sick Leave or any other paid leave required by Federal, State, or local law, as well as Holiday pay where applicable.

Disclaimer: These benefit offerings do not apply to client-recruited jobs and jobs that are direct hires to a client.

To read our Candidate Privacy Information Statement, which explains how we will use your information, please visit

The Company will consider qualified applicants with arrest and conviction records in accordance with federal, state, and local laws and/or security clearance requirements, including, as applicable:
· The California Fair Chance Act
· Los Angeles City Fair Chance Ordinance
· Los Angeles County Fair Chance Ordinance for Employers
· San Francisco Fair Chance Ordinance

Thanks & Regards
Aditya Agnihotri [Aadi]
Sr. Recruiter
Akkodis (An Adecco Group Company)