Member of Technical Staff - Kernels (San Francisco)

MTS - Kernel Engineer
Full-time | San Francisco / On-site preferred

About the Company

We are an AI research company focused on building safe, advanced AI systems that accelerate progress on important global problems. Our work centers on automating research and code generation to improve model capabilities and alignment more reliably than humans can alone.

Our technical approach combines frontier-scale pre-training, domain-specific reinforcement learning, ultra-long context, and inference-time compute. These systems create unique infrastructure challenges across training, inference, memory utilization, and hardware efficiency.

About the Role

As a Kernel Engineer, you will design, implement, and maintain high-performance kernels that optimize throughput and latency during training and inference.

The company's long-context workloads create distinct kernel optimization challenges around memory utilization, data movement, and sustained throughput. You will work close to the hardware and collaborate with training, inference, and RL teams to make large-scale model systems faster, more reliable, and more efficient.

What You'll Work On

- Design and implement kernels that support high-performance long-context model behavior.
- Own kernel design, implementation, deployment, and production reliability.
- Balance robustness, extensive testing, and functional correctness with aggressive performance optimization.
- Evaluate porting compute kernels to alternative hardware platforms.
- Co-design kernels in collaboration with training, inference, and reinforcement learning teams.
- Work on kernel-level optimizations for frontier-scale AI systems, including memory movement, sustained throughput, and accelerator utilization.

What We're Looking For

- Low-level programming experience targeting AI accelerators, such as modern GPUs, TPUs, or similar accelerator architectures.
- Experience developing and optimizing GPU or accelerator kernels using frameworks such as collective communication libraries, template-based GPU programming libraries, DSLs, attention kernels, or comparable low-level performance frameworks.
- Experience with kernel-authoring frameworks for GPU or TPU platforms.
- Deep expertise in computer architecture, low-level machine optimization, memory systems, and code generation.
- Strong kernel engineering depth, with enough breadth across machine learning systems to understand training and inference workloads.
- Ownership mindset, agility, and persistence in solving difficult performance and reliability problems.
- Experience with tools such as CUDA, Triton, ROCm, Halide, and CUTLASS.

Bonus Experience

- Experience optimizing kernels for long-context training or inference workloads.
- Familiarity with attention kernel optimization, communication kernels, or distributed training performance.
- Experience with alternative accelerator backends beyond GPUs.
- Production experience deploying and maintaining high-performance kernels in large-scale ML systems.
- Strong testing and validation practices for numerical correctness and reliability.

Compensation and Benefits

- Competitive salary and meaningful equity.
- Retirement plan with employer matching.
- Comprehensive health, dental, and vision coverage.
- Flexible paid time off.
- Relocation and visa support where possible.
- Small, focused, fast-moving technical team.

Culture

- Integrity: words and actions should be aligned.
- Hands-on execution: everyone contributes directly to building.
- Teamwork: the company moves as one team.
- Focus: prioritize the core mission above distractions.
- Quality: systems and products should feel exceptional.
