Site Reliability Engineer
Mountain View, CA
April 3rd, 2026
Job Description
Site Reliability Engineer
Onsite - Bay Area, CA

Relevant Skills and Experience

What You'll Do (Day-to-Day)
- Own and manage our cloud infrastructure (GCP or AWS, on-prem).
- Build, maintain, and optimize Kubernetes clusters (including GPU-backed clusters).
- Implement and improve CI/CD pipelines (GitHub Actions).
- Write and maintain Infrastructure as Code (Terraform).
- Monitor system health and performance using Grafana and other observability tools.
- Ensure high availability, reliability, and uptime across platforms.
- Handle infrastructure maintenance, upgrades, and scaling.
- Administer and improve our platform architecture and apply general security best practices across the stack.

Note: This is an internal-facing role with no customer interaction.

Must-Have:
- 4+ years in SRE, DevOps, or Infrastructure Engineering
- Solid experience with GCP or AWS (hybrid/on-prem a plus)
- Experience with Kubernetes cluster management (GPU experience a bonus)
- Hands-on with Terraform and CI/CD (GitHub)
- Experience with monitoring/observability (Grafana, etc.)
- Strong understanding of high availability and infrastructure reliability
- Familiarity with platform/cluster architecture and administration
- Security mindset and ability to apply best practices

Nice-to-Have:
- Startup experience (you enjoy building, not just maintaining)
- Experience with scalable GPU infrastructure for AI/ML