Senior Software Development Engineer - Distributed KV Caching and Storage Systems

About the Team
Join ByteDance's KV caching and storage systems team, where we build and own mission-critical distributed KV caching and storage products powering ByteDance's global infrastructure. Our portfolio includes Redis-compatible services, next-generation shared-storage engines, and performance/cost optimization components, along with a full ecosystem of operational automation, observability, data movement, and recovery capabilities. We serve ByteDance's core business scenarios at massive scale - recommendation, search, ads, e-commerce, messaging, live streaming, and collaboration suites - with strict requirements on availability, latency, throughput, global deployment, and cost efficiency.

Responsibilities
- Design and develop core KV caching and storage systems, including distributed caching systems and Redis-compatible KV storage systems, with a focus on low latency, high throughput, and high availability.
- Build planet-scale reliability, leading or contributing to HA architecture, failure isolation, multi-AZ/multi-region disaster recovery, and large-scale stability engineering for always-on business workloads.
- Drive compute/storage efficiency improvements (CPU, memory, IO, network), including cache hierarchy designs (memory/SSD), read/write amplification reductions, and capacity planning for billion-level request traffic.
- Build a production-grade ecosystem, including automated orchestration operations (provisioning, scaling, placement, scheduling) and monitoring systems (tracing, profiling, incident response runbooks).
- Implement and evolve capabilities such as Bulkload, backup & restore, point-in-time recovery, tiered storage, and integration with upstream/downstream data systems to enrich data ecosystems.
- Research new hardware and new technologies, and evaluate and land improvements in production using ZNS SSD, io_uring, RDMA/CXL, and "AI+DB" directions.

Minimum Qualifications:
- BS or a higher degree in Computer Science or related fields, or equivalent practical experience.
- Proficiency in one or more programming languages (C, C++, Java, Go, Python, Rust) with strong coding skills in a Linux environment.
- Solid fundamentals in distributed systems, database/storage principles, networking, and multi-threaded programming; strong debugging and performance analysis skills (profiling, tracing, flame graphs, lock contention, tail latency).
- Hands-on experience building or operating large-scale distributed systems (high QPS, high concurrency, strict SLO/SLA), with proven ability to improve stability, performance, and cost.
- Clear and logical thinking, coupled with a product-oriented mindset, self-driven initiative, and strong project management skills.

Preferred Qualifications:
- 3+ years in database internals/storage engine/cache system development, or equivalent large-scale infrastructure experience.
- Familiarity with or contributions to systems such as Redis, Tair, MemoryDB, RocksDB, pika, TiDB, etc.
- Strong knowledge of distributed consensus algorithms, with experience in database kernel development.
- Experience with Linux kernel-level performance tuning, networking stack optimization, or IO subsystem.
- Familiarity with RDMA, CXL, ZNS SSD, or modern storage hardware.
- Interest or experience in applying AI techniques to database systems (e.g., cost modeling, workload prediction, auto-tuning).
