Site Reliability Engineering Manager
San Jose, CA
March 20th, 2026
Job Description:
- Working experience as an SRE manager is mandatory, especially in retail-domain application support (not Cloud/DevOps).
- Must have working knowledge of SRE principles such as logs, metrics, availability metrics, uptime, ticket tracking, and e-commerce services, and of the ITIL framework, specifically alerts, incident and change management, CAB, production deployments, risk and mitigation plans, SLA, SLI, and SLO.
- Hands-on experience in monitoring, logging, alerting, dashboarding, and report generation in any observability tool (DataDog preferred, or other tools such as Splunk, Dynatrace, ELK, or Grafana). The customer for this engagement uses Dynatrace, Splunk, and PagerDuty, so expertise in these tools is good to have.
- Work experience leading a Level 2/Level 3 application support team based in India that provides 24x7 coverage is mandatory.
- Should know how to gather and communicate SRE requirements from customers and define an SRE roadmap.
- Working experience gathering requirements on application health, services to monitor, and setting service levels.
- Must have good knowledge of eCommerce platforms in a microservice architecture, Sterling OMS, and retail applications such as XStore.
- Should be able to lead P1 calls, brief the customer on the P1, and proactively bring leads and customers into P1 calls through RCA, PIR, etc.
- Should know how to build processes and frameworks following ITSM principles, SOPs, and runbooks, and how to work with ITSM platforms (JIRA/ServiceNow/BMC Remedy).
- Must know how to work with the Dev team and cross-functional teams.
- Should be able to generate WSR/MSR by extracting tickets from ITSM platforms and present them to customers and client leaders.
- Manage overall SRE delivery with a customer-focused mindset, working closely with customer leadership.

Preferred:
- Be the client-facing presence at the customer site, collaborating with client leadership.
- Ability to clearly communicate and understand technical ideas and concepts.
- Ability to work in a professional environment while interacting with peers and stakeholders and collaborating with offshore teams.
- Excellent written and verbal communication skills.
- Motivated, goal-driven, influential, innovative, curious, open-minded, collaborative, and fun to work with.
- Capable of working with people in different time zones.
- Ability to operate in a fast-paced, evolving environment, prioritize tasks appropriately, and keep abreast of the latest technology.
- Collaborate with the cloud architecture, infrastructure, project management, technology services, and management teams.
- Create and maintain detailed documentation.