Tech Lead, Research Scientist/Engineer - AI Infrastructure
San Jose, CA | March 31st, 2026
We are seeking a Tech Lead to provide technical stewardship in defining and building the next generation of AI infrastructure. You will help build the technical roadmap at the intersection of AI models, software systems, and emerging hardware, architecting the infrastructure that ensures reliable, efficient, and scalable AI at ByteDance. You will work closely with tech leaders, architects, and product teams to translate evolving AI requirements into robust infrastructure architectures. The role involves identifying emerging trends in AI algorithms and systems, designing scalable system architectures, and driving innovations that improve performance, reliability, and cost efficiency across the AI stack.

Responsibilities
- AI Infrastructure Architecture: Design and evaluate scalable infrastructure architectures for large-scale ML workloads across compute, storage, and networking. Develop technical proposals and specifications that guide next-generation AI infrastructure systems.
- Research & Technology Exploration: Track emerging trends in AI systems, distributed computing, and hardware acceleration. Conduct technical investigations and build prototypes, and share insights through technical reports and presentations.
- Performance & System Optimization: Analyze and optimize performance across the ML infrastructure stack (including scheduling, networking, storage, and training frameworks) through benchmarking, experimentation, and bottleneck analysis.
- Cross-Team Technical Alignment: Work across research and engineering teams to translate AI workload requirements into scalable infrastructure solutions, providing architectural guidance and driving cross-team technical initiatives.

Required Qualifications
- Master's degree or PhD in Computer Science, Electrical Engineering, or a related technical field.
- Strong proficiency in integrating AI tools into knowledge discovery and research workflows.
- 5 years of experience in distributed systems, infrastructure engineering, or ML systems.
- Experience evaluating trade-offs across hardware, software, and algorithms.
- Excellent communication skills to collaborate across teams.

Preferred Qualifications
- Experience with large-scale model training and inference, including distributed training, KV cache-aware serving, GPU/accelerator optimization, and high-performance networking (e.g., RDMA, NCCL).
- Experience with heterogeneous AI compute systems, large-scale training clusters, HPC-style distributed workloads, and data pipelines for large model training and evaluation.
- Publications in systems and/or machine learning conferences (e.g., NeurIPS, OSDI, SOSP, ASPLOS, MLSys).
- Contributions to open-source projects.
Showing all 356 matching similar jobs
- Senior Staff Software Engineer - GPU & AI Performance
- Strategic AI Systems Lead: Scalable ML & Cloud Production
- Staff Autonomy Engineer
- ConvergeHEALTH - R&D Life Sciences, Product Owner - Innovation_Delivery_Transformation
- Sr. Product Manager - AI Innovation
- Research Engineer, TikTok AI Search (LLM Pretraining/Alignment/Inference)
- Sr Manager, Machine Learning Engineering, Firefly Services
- Research Engineer Graduate (Ads ML Infrastructure) - 2026 Start (PhD)
- Staff Autonomy Engineer
- Multimodal AI Algorithm Expert-EMG / Interaction Perception, PICO
- Lead AI Engineer
- Staff Research Scientist/Engineer, ML Recommendation Systems, Applied Machine Learning Team
- Applied ML - Functional Verification Engineer
- HPE Labs - Research Engineer III
- Tech Lead, Senior Machine Learning Engineer - TikTok Search Algorithms (NLP, Ranking, Relevance, Understanding, User Engagement)
- Distinguished AI Researcher
- Frontend Engineer - Machine Learning Platform San Jose Regular
- Machine Learning Engineer - Inference San Jose Regular
- Staff Machine Learning Engineer, LLM Fine‐Tuning (Verilog/RTL Applications)
- Lead Customer Facing Applied AI Engineer
- Machine Learning Engineer Graduate (Monetization Technology - TikTok Ads Creative & Ecosystem) [...]
- Director of Generative AI & Foundation Models
- Senior ML Scientist, Search & Personalization | Expedia Inc | San Jose, CA | March 31st, 2026
- Staff ML Engineer: Personalization & Growth
- Senior ML Inference Engineer - Distributed Systems & Equity
- (General Hire) Machine Learning Engineer, Data-Search - San Jose - 2026 Start (PhD) | TikTok | San Jose, CA | April 1st, 2026
- Director of Software Engineering & Applied AI (Level 6) – Product Success Engineering, Adobe Ex[...]
- Cisco is Seeking Machine Learning Engineer – AI Research
- Lead Data Scientist
- Machine Learning Engineer - Inference San Jose Regular
- Staff Machine Learning Engineer/Architect – Personalization, Adobe Experience Platform (AEP)
- Machine Learning Engineer Graduate (TikTok Short Video Content Understanding/Multimodal Recomme[...]
- Manager, Data Science (Recommender Systems)
- Senior AI Solutions Architect - MDR Platform
- Staff AI Solutions Architect
- Lead, AI-Driven TV Search & Growth
- Lead AI/ML Architect: Industrial & Automotive Pipelines
- AI-Powered XOps Observability Product Manager (Remote)
- Sr. Technical Product Manager | San Jose, CA | March 31st, 2026
- Staff Product Manager