Site Reliability Engineer (US)
$120,000 - $140,000 a year
Full-time
- We specialize in managed services across Google Cloud Platform (GCP) and Amazon Web Services (AWS), and we are seeking a Site Reliability Engineer (SRE) who is passionate about technology, excels at problem-solving, and is committed to providing exceptional customer service.
- You will become the subject matter expert (SME) on the scale, resiliency, and uptime of our own environments and the customer environments we support.
- Role Summary: As a critical member of our team, the SRE will provide technical support and expertise to our managed services clients.
- Design, implement, and maintain monitoring and alerting systems to detect and address issues proactively, primarily using Datadog, GCP Cloud Monitoring, and PagerDuty/Incident.io.
- Debug and troubleshoot production issues across various customer environments, technology stacks, and cloud providers, focusing primarily on GCP and AWS.
- Participate in an on-call rotation to respond to and resolve production incidents, and conduct root cause analyses (RCAs) and postmortems to identify and address underlying issues.
- Develop and maintain infrastructure as code (Terraform) and configuration management (for example, Ansible and Helm).
- Work closely with development teams to understand system architecture, identify potential reliability risks, and implement solutions.