Ceph Cluster Development Engineer (C++ Focus)
Santa Clara, CA
March 26th, 2026
We are seeking a highly skilled Ceph Cluster Development & Operations Engineer with strong expertise in C++ systems programming to design, extend, and maintain enterprise-scale Ceph distributed storage clusters. The role involves deep development in Ceph core subsystems (RADOS, OSD, RGW, MDS), performance optimization, and operational excellence across multi-site, multi-zone architectures.

You will work closely with system architects, SREs, and cloud infrastructure teams to ensure the reliability, scalability, and security of mission-critical storage systems deployed across multiple data centers and Kubernetes environments.

Key Responsibilities
- Design, build, and operate large-scale Ceph clusters, including RADOS, RGW, and RBD.
- Contribute to or extend Ceph core components written in C++ (e.g., OSD, RGW, librados, BlueStore, MGR modules).
- Profile and optimize performance across network, disk I/O, and replication layers (PG placement, CRUSH rules, BlueStore tuning).
- Develop automation and tooling for cluster lifecycle management (deployment, upgrades, scaling, failover, and recovery).
- Integrate Ceph with Kubernetes (via Rook-Ceph, CSI drivers) and CI/CD pipelines for continuous delivery.
- Implement and validate multi-site replication and disaster recovery architectures for high availability.
- Develop and maintain secure storage solutions using dm-crypt, KMS integration, and CephX authentication.
- Build observability pipelines using Prometheus, Grafana, and custom exporters for metrics and health analytics.
- Write and maintain SOPs, automation scripts, and system documentation to support production-grade operations.
- Collaborate with the upstream Ceph community or maintain in-house forks for feature development and bug fixes.

Qualifications

Required Skills
- Strong proficiency in C++ (C++11 or later), with experience in large-scale distributed systems or kernel-adjacent development.
- Deep understanding of Ceph architecture and its core components: MON, OSD, MGR, RGW, MDS, and CRUSH maps.
- Proficient in Linux systems programming, debugging (gdb, perf, valgrind), and performance profiling.
- Experience with Python or Go for tooling and automation.
- Strong foundation in data replication, erasure coding, and consistency models in distributed storage.
- Hands-on experience with Kubernetes, Rook-Ceph, Helm, Ansible, and related DevOps tools.
- Familiarity with TCP/IP, HTTP/S3 APIs, block storage (RBD/iSCSI), and object storage semantics.
- Ability to conduct root-cause analysis and lead performance investigations in production environments.

Preferred Skills
- Contributions to the Ceph open-source project or prior experience modifying Ceph source code.
- Experience with multi-site replication, object versioning, compliance retention, or legal hold features.
- Background in distributed storage systems, file systems, or cloud storage platforms.
- Familiarity with containerized environments, network virtualization, and cloud-native observability stacks.
- Excellent technical documentation and communication skills in English.

The US base salary range for this full-time position is $179,000-$219,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program.

Wage ranges are based on various factors, including the labor market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location.

All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at the time of hire and annually at the Company's discretion.

Why Join Us:
We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being.

Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.