Machine Learning Architect - Conversational Speech
Cupertino, CA
April 2nd, 2026
**Weekly Hours:** 40
**Role Number:** 200653966-0836

**Summary**

The Speech organization within Siri is at the forefront of building the technologies that power conversational AI, speech recognition, speech synthesis, and speech-to-speech experiences across Apple's entire ecosystem. Our mission is to develop cutting-edge models, infrastructure, and datasets that enable Siri, dictation, and Apple Intelligence features to deliver natural, intelligent, and deeply personalized speech interactions for billions of users worldwide.

We are seeking a Machine Learning Architect to serve as a senior technical leader spanning the full Speech organization. In this role, you will set the future modeling direction for all of conversational speech, charting the architectural and algorithmic course for how Apple's speech technologies evolve over the coming years. You will operate as a hands-on expert who not only defines strategy but also digs into the hardest technical problems, working shoulder-to-shoulder with teams to overcome critical obstacles and unlock breakthroughs. Reporting directly to Speech organization leadership, you will have broad visibility and influence across speech recognition, synthesis, dialog, multimodal foundation models, and speech-to-speech systems, ensuring a coherent technical vision and cross-team alignment.

We believe the most impactful advances in deep learning emerge when world-class research is anchored in real-world production needs at scale. This role offers a rare opportunity to shape the trajectory of conversational speech technology across Apple's software, hardware, and services, improving speech interaction experiences for Apple's customers around the world.

**Description**

As the Machine Learning Architect for Conversational Speech, you will:

+ Define modeling strategy and technical direction across the Speech organization, establishing a unified architectural vision for speech recognition, speech synthesis, dialog systems, multimodal foundation models, and speech-to-speech technologies.
+ Serve as the organization's foremost modeling expert, providing deep technical guidance and mentorship to multiple teams of researchers and engineers working on distinct but interconnected speech capabilities.
+ Identify and drive solutions to the most challenging technical problems, rolling up your sleeves to prototype, debug, and iterate on novel approaches when teams encounter critical obstacles.
+ Evaluate emerging research and industry trends (e.g., advances in large language models, multimodal architectures, and full-duplex natural conversational systems) and translate them into actionable roadmaps aligned with Apple's product and platform priorities.
+ Drive cross-team technical alignment, ensuring that modeling choices, training methodologies, data strategies, and infrastructure investments are coherent and mutually reinforcing across the organization.
+ Champion production readiness, bridging the gap between research innovation and deployed systems by ensuring that architectural decisions account for on-device constraints, latency requirements, scalability, robustness, and real-world data conditions.
+ Collaborate broadly with partner teams across Siri, Apple Intelligence, hardware, and platform engineering to ensure speech modeling investments are well integrated into Apple's broader AI and product strategy.
+ Contribute to the broader ML and speech research community through publications, patents, and engagement with the state of the art.

**Minimum Qualifications**

+ 10+ years of experience in machine learning applied to speech or multimodal systems, with progressively increasing technical scope and leadership.
+ Demonstrated expertise as a technical leader or architect who has defined modeling direction across multiple teams or product areas, not solely as an individual contributor on a single workstream.
+ Deep, hands-on proficiency in modern deep learning, including large language models and end-to-end speech systems.
+ Significant experience with multimodal LLMs, including architecture design, training, adaptation, and deployment of models that integrate speech, audio, and text modalities.
+ Direct experience building speech-to-speech conversational systems, with a strong understanding of full-duplex natural conversational interaction and end-to-end speech pipelines.
+ A track record of translating research into production-quality systems at scale.
+ Expert programming skills in Python and deep learning frameworks such as PyTorch, JAX, or TensorFlow.
+ Proven ability to diagnose and resolve complex, cross-cutting technical challenges spanning model architecture, training methodology, data quality, and systems integration.

**Preferred Qualifications**

+ Ph.D. in Computer Science, Electrical Engineering, Machine Learning, or a closely related field.
+ Experience architecting or leading development of full-duplex natural conversational systems, speech-to-speech models, or multimodal foundation models that have shipped to large-scale user populations.
+ Deep familiarity with the full stack of speech technologies (ASR, TTS, spoken dialog, speaker modeling, and audio understanding) and an ability to reason about their interactions and dependencies.
+ Experience with large-scale distributed training and the infrastructure considerations that shape model design at scale.
+ A data-centric perspective on foundation model development, including experience guiding data collection, curation, annotation, and quality strategies.
+ Track record of influencing technical direction across organizational boundaries, including the ability to build consensus, communicate complex trade-offs clearly, and drive alignment among diverse stakeholders.
+ Experience with on-device ML deployment, including model compression, quantization, and latency-aware architecture design.
+ Demonstrated ability to mentor and elevate senior technical talent, raising the bar for modeling excellence across an organization.

Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant (https://www.eeoc.gov/sites/default/files/2023-06/22-088\_EEOC\_KnowYourRights6.12ScreenRdr.pdf).