SRE requirements: proficiency with Infrastructure as Code / GitOps tooling. This role is for an SRE who is passionate about leveraging data and automation to drive a highly dynamic infrastructure.
Monitor infrastructure using SRE tools and suggest tools as necessary. Experience with automation tooling such as Chef, Docker, AWS. As the solution scales, ensure reliability through designing, building, and maintaining the core infrastructure.
Develop and drive the overall reliability strategy for the Network and DC-Ops SRE organization, aligning it with the organization's business goals and objectives. Experience with infrastructure automation, tooling, and configuration management frameworks (e.g., Puppet, Chef, Ansible, Pulumi, Terraform, etc.).
Serve as an SRE to proactively establish the means (through tooling) to effectively monitor, analyze, report, and observe the health and upkeep of the systems and/or environments.
Apple Corporate Systems team is seeking a Lead Software Reliability Engineer to work on providing our applications with top SRE practices and scale our tooling and instrumentation to prepare for further growth.
We are looking for a Site Reliability Engineer (SRE) to join the IT Operations Corporate Engineering team to build and maintain the tooling to secure and manage our fleet across multiple platforms (macOS, ChromeOS, Windows) and across various hardware types (laptop, desktop, mobile).
SRE (Site Reliability Engineer) / DevOps mindset and experience. Experience deploying IaC via a pipeline / tooling rather than from the command line on a local workstation (with YAML and AZ-CLI, the Azure Command-Line Interface).
Members of this team will be challenged to combine the tools and patterns of the modern Kubernetes ecosystem with learnings gained from experiences with CI/CD, distributed systems orchestration, traffic shaping, progressive rollouts, application packaging, channel management, and SRE fundamentals to develop CoreWeave’s next-generation solution for internal service delivery.
$165,000 - $200,000 a year, Full-time
Experience configuring and implementing modern IaC tooling such as Terraform and Ansible. Familiarity with security tooling such as: zero trust networking, device encryption & key escrow, anti-malware tooling, PKI, patch management, and application allow/deny-listing.
Establish key practices to ensure that availability, stability, scalability, performance, monitoring, and incident response are handled appropriately through automation. Construct and maintain an incident response playbook with documented corrective actions.
You are well versed in infrastructure tooling such as Terraform, have worked with Kubernetes, and have strong SRE skills. We currently operate two real-time storage systems: CosmosDB and PostgresDB. As an Online Storage Infrastructure engineer, you’ll contribute to the reliability, efficiency, high availability, security, and user-friendliness of our database systems in a fast-paced environment.
$160,000 - $385,000 a year, Full-time
Azure databases (Az-SQL, Cosmos DB, etc.). Use of IaC / scripting technologies such as Terraform and Bicep. Network security (Azure Firewall, Service Firewall, NSG, WAF). The Cloud Engineer will demonstrate knowledge of cloud architecture best practices, cloud security, and cloud infrastructure monitoring and tuning.
Participate in on-call for all tools in the reliability and resiliency space, which are usually written by SREs. SREs at Roblox are not operations engineers; they are infra-oriented software engineers who code within the platform and tooling feature sets.
Site Reliability Engineer - Client Platform. Relevant experience in configuring and implementing modern open source endpoint management tooling such as: Puppet, Munki, Autopkg, NanoMDM/MicroMDM, Crypt.
Participate in the on-call rotation to field and support issues as they arise. Implement automation to mitigate risks and faults based on reactive and proactive measures. Collaborate with specific SMEs from various teams to investigate, troubleshoot, and resolve issues.
reliability sre tooling jobs