Ceph Cluster Development Engineer (C++ Focus)
Santa Clara, CA
March 26th, 2026
We are seeking a highly skilled Ceph Cluster Development & Operations Engineer with strong expertise in C++ systems programming to design, extend, and maintain enterprise-scale Ceph distributed storage clusters. The role involves deep development in Ceph core subsystems (RADOS, OSD, RGW, MDS), performance optimization, and operational excellence across multi-site, multi-zone architectures.

You will work closely with system architects, SREs, and cloud infrastructure teams to ensure the reliability, scalability, and security of mission-critical storage systems deployed across multiple data centers and Kubernetes environments.

Key Responsibilities
- Design, build, and operate large-scale Ceph clusters, including RADOS, RGW, and RBD.
- Contribute to or extend Ceph core components written in C++ (e.g., OSD, RGW, librados, BlueStore, MGR modules).
- Profile and optimize performance across network, disk I/O, and replication layers (PG placement, CRUSH rules, BlueStore tuning).
- Develop automation and tooling for cluster lifecycle management (deployment, upgrades, scaling, failover, and recovery).
- Integrate Ceph with Kubernetes (via Rook-Ceph, CSI drivers) and CI/CD pipelines for continuous delivery.
- Implement and validate multi-site replication and disaster recovery architectures for high availability.
- Develop and maintain secure storage solutions using dm-crypt, KMS integration, and CephX authentication.
- Build observability pipelines using Prometheus, Grafana, and custom exporters for metrics and health analytics.
- Write and maintain SOPs, automation scripts, and system documentation to support production-grade operations.
- Collaborate with the upstream Ceph community or maintain in-house forks for feature development and bug fixes.

Qualifications

Required Skills
- Strong proficiency in C++ (C++11 or later), with experience in large-scale distributed systems or kernel-adjacent development.
- Deep understanding of Ceph architecture and its core components: MON, OSD, MGR, RGW, MDS, and CRUSH maps.
- Proficiency in Linux systems programming, debugging (gdb, perf, valgrind), and performance profiling.
- Experience with Python or Go for tooling and automation.
- Strong foundation in data replication, erasure coding, and consistency models in distributed storage.
- Hands-on experience with Kubernetes, Rook-Ceph, Helm, Ansible, and related DevOps tools.
- Familiarity with TCP/IP, HTTP/S3 APIs, block storage (RBD/iSCSI), and object storage semantics.
- Ability to conduct root-cause analysis and lead performance investigations in production environments.

Preferred Skills
- Contributions to the Ceph open-source project or prior experience modifying Ceph source code.
- Experience with multi-site replication, object versioning, compliance retention, or legal hold features.
- Background in distributed storage systems, file systems, or cloud storage platforms.
- Familiarity with containerized environments, network virtualization, and cloud-native observability stacks.
- Excellent technical documentation and communication skills in English.

The US base salary range for this full-time position is $179,000-$219,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program.

Wage ranges are based on various factors, including the labour market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location.

All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at the time of hire and annually at the Company's discretion.

Why Join Us:
We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being.

Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.