Research Engineer - Reinforcement Learning (RL) Systems & Infrastructure (Seed Infra)

computer and information research scientists

mathematical science occupations, all other

computer systems design and related services; colleges, universities, and professional schools; wired and wireless telecommunications (except satellite); all other telecommunications; business schools and computer and management training

San Jose, CA

April 5th, 2026

About the Team
The Seed Infrastructures team oversees the distributed training, reinforcement learning framework, high-performance inference, and heterogeneous hardware compilation technologies for AI foundation models.

Responsibilities
- Design and build end-to-end reinforcement learning (RL) systems for large-scale models, covering rollout, training, evaluation, and deployment pipelines.
- Develop scalable and fault-tolerant RL infrastructure that operates efficiently under dynamic workloads and heterogeneous compute environments.
- Optimize distributed training performance across GPU clusters, improving throughput, resource utilization, and system stability.
- Collaborate with cross-team researchers on targeted system-algorithm co-design to translate research ideas into robust, production-grade implementations.
- Build tooling, monitoring, and debugging frameworks to ensure reliability and observability of large-scale RL training systems.

Minimum Qualifications:
- Strong background in distributed systems, large-scale ML systems, or deep learning infrastructure
- Experience building or optimizing large-scale training systems (e.g., RL, LLM, multimodal models)
- Solid engineering skills in Python/C++ and familiarity with modern ML stacks (PyTorch, distributed training frameworks, etc.)
- Experience with GPU optimization, parallelism strategies, and system-level performance tuning
- Understanding of reinforcement learning workflows (rollout, policy update, evaluation loops)

Preferred Qualifications:
- Experience with large-scale agent systems
- Familiarity with system design under heterogeneous or dynamic workloads
- Exposure to RL + LLM training or post-training pipelines
