Tech Lead, Research Scientist/Engineer - AI Infrastructure
San Jose, CA
April 2nd, 2026
We are seeking a Tech Lead to provide technical stewardship in defining and building the next generation of AI infrastructure. You will help build the technical roadmap at the intersection of AI models, software systems, and emerging hardware, architecting the infrastructure that ensures reliable, efficient, and scalable AI at ByteDance.

You will work closely with tech leaders, architects, and product teams to translate evolving AI requirements into robust infrastructure architectures. The role involves identifying emerging trends in AI algorithms and systems, designing scalable system architectures, and driving innovations that improve performance, reliability, and cost efficiency across the AI stack.

Responsibilities
- AI Infrastructure Architecture: Design and evaluate scalable infrastructure architectures for large-scale ML workloads across compute, storage, and networking. Develop technical proposals and specifications that guide next-generation AI infrastructure systems.
- Research & Technology Exploration: Track emerging trends in AI systems, distributed computing, and hardware acceleration. Conduct technical investigations and prototypes, and share insights through technical reports and presentations.
- Performance & System Optimization: Analyze and optimize performance across the ML infrastructure stack (including scheduling, networking, storage, and training frameworks) through benchmarking, experimentation, and bottleneck analysis.
- Cross-Team Technical Alignment: Work across research and engineering teams to translate AI workload requirements into scalable infrastructure solutions, providing architectural guidance and driving cross-team technical initiatives.

Required Qualifications
- Master's degree or PhD in Computer Science, Electrical Engineering, or a related technical field.
- Strong proficiency in integrating AI tools into knowledge discovery and research workflows.
- 5 years of experience in distributed systems, infrastructure engineering, or ML systems, including experience evaluating trade-offs across hardware, software, and algorithms.
- Excellent communication skills for collaborating across teams.

Preferred Qualifications
- Experience with large-scale model training and inference, including distributed training, KV cache-aware serving, GPU/accelerator optimization, and high-performance networking (e.g., RDMA, NCCL).
- Experience with heterogeneous AI compute systems, large-scale training clusters, HPC-style distributed workloads, and data pipelines for large model training and evaluation.
- Publications in systems and/or machine learning conferences (e.g., NeurIPS, OSDI, SOSP, ASPLOS, MLSys).
- Contributions to open-source projects.