Ceph Cluster Development Engineer (C++ Focus)
Santa Clara, CA
March 26th, 2026
We are seeking a highly skilled Ceph Cluster Development & Operations Engineer with strong expertise in C++ systems programming to design, extend, and maintain enterprise-scale Ceph distributed storage clusters. The role involves deep development in Ceph core subsystems (RADOS, OSD, RGW, MDS), performance optimization, and operational excellence across multi-site, multi-zone architectures.

You will work closely with system architects, SREs, and cloud infrastructure teams to ensure the reliability, scalability, and security of mission-critical storage systems deployed across multiple data centers and Kubernetes environments.

Key Responsibilities
- Design, build, and operate large-scale Ceph clusters including RADOS, RGW, RBD.
- Contribute to or extend Ceph core components written in C++ (e.g., OSD, RGW, librados, BlueStore, MGR modules).
- Profile and optimize performance across network, disk I/O, and replication layers (PG placement, CRUSH rules, BlueStore tuning).
- Develop automation and tooling for cluster lifecycle management (deployment, upgrades, scaling, failover, and recovery).
- Integrate Ceph with Kubernetes (via Rook-Ceph, CSI drivers) and CI/CD pipelines for continuous delivery.
- Implement and validate multi-site replication and disaster recovery architectures for high availability.
- Develop and maintain secure storage solutions using dm-crypt, KMS integration, and CephX authentication.
- Build observability pipelines using Prometheus, Grafana, and custom exporters for metrics and health analytics.
- Write and maintain SOPs, automation scripts, and system documentation to support production-grade operations.
- Collaborate with upstream Ceph community or maintain in-house forks for feature development and bug fixes.

Qualifications

Required Skills
- Strong proficiency in C++ (C++11 or later), with experience in large-scale distributed systems or kernel-adjacent development.
- Deep understanding of Ceph architecture and its core components: MON, OSD, MGR, RGW, MDS, and CRUSH maps.
- Proficient in Linux systems programming, debugging (gdb, perf, valgrind), and performance profiling.
- Experience with Python or Go for tooling and automation.
- Strong foundation in data replication, erasure coding, and consistency models in distributed storage.
- Hands-on experience with Kubernetes, Rook-Ceph, Helm, Ansible, and related DevOps tools.
- Familiarity with TCP/IP, HTTP/S3 APIs, block storage (RBD/iSCSI), and object storage semantics.
- Ability to conduct root-cause analysis and lead performance investigations under production environments.

Preferred Skills
- Contributions to the Ceph open-source project or prior experience modifying Ceph source code.
- Experience with multi-site replication, object versioning, compliance retention, or legal hold features.
- Background in distributed storage systems, file systems, or cloud storage platforms.
- Familiarity with containerized environments, network virtualization, and cloud-native observability stacks.
- Excellent technical documentation and communication skills in English.

The US base salary range for this full-time position is $179,000-$219,000. Fortinet offers employees a variety of benefits, including medical, dental, vision, life and disability insurance, 401(k), 11 paid holidays, vacation time, and sick time, as well as a comprehensive leave program.

Wage ranges are based on various factors, including the labour market, job type, and job level. Exact salary offers will be determined by factors such as the candidate's subject knowledge, skill level, qualifications, experience, and geographic location.

All roles are eligible to participate in the Fortinet equity program. Bonus eligibility is reviewed at the time of hire and annually at the Company's discretion.

Why Join Us:
We encourage candidates from all backgrounds and identities to apply. We offer a supportive work environment and a competitive Total Rewards package to support you with your overall health and financial well-being.

Embark on a challenging, enjoyable, and rewarding career journey with Fortinet. Join us in bringing solutions that make a meaningful and lasting impact to our 660,000+ customers around the globe.