ML architect

nextgenpros
San Jose, CA
May 20th, 2026

Occupations:

Computer Systems Engineers/Architects
Data Scientists
Software Developers
Database Architects
Computer and Information Research Scientists

Industries:

Web Search Portals, Libraries, Archives, and Other Information Services
Computer Systems Design and Related Services
Employment Services
Educational Support Services
Specialized Design Services

Title: Machine Learning Architect
Location: San Jose, CA (Hybrid)
Long-term contract on C2C

We need a senior Architect/Lead-level AI/ML resource with strong expertise in Applied NLP, Machine Learning, and Data Science to design and build scalable, enterprise-grade query understanding and intelligent routing solutions. The ideal candidate has hands-on experience in recommendation systems, forecasting, entity extraction, semantic retrieval, and low-latency ML systems, along with exposure to LLM fine-tuning, RAG, and modern AI architectures. This role requires both deep technical capability and architectural leadership to drive current implementation needs as well as future AI/LLM initiatives.

Responsibilities

- Design and implement a query understanding pipeline to extract intent, routing decisions, entities, application mapping, and historical evidence from user queries and conversations.
- Define and build the training data model and annotation schema for structured outputs (intent, routing, entities, applications, evidence).
- Lead data collection, synthesis, analysis, and cleaning to develop high-quality datasets for model training and evaluation.
- Develop and evaluate baseline and advanced non-LLM models for:
  - Intent classification
  - Query routing
  - Entity extraction
  - Application detection
  - Evidence retrieval
- Design and implement advanced ML/Data Science solutions for enterprise use cases, including:
  - Recommendation systems
  - Forecasting and predictive analytics
  - Behavioral and usage pattern analysis
- Lead experimentation and implementation of LLM-based solutions, including:
  - Fine-tuning and optimization of foundation models
  - Prompt engineering and retrieval augmentation strategies
  - Evaluation and benchmarking of LLM performance for enterprise workflows
- Build and maintain train, test, and evaluation pipelines with a strong focus on:
  - Accuracy and F1 score
  - Confidence scoring and calibration
  - Latency and throughput
- Optimize models to meet strict constraints:
  - Sub-second inference latency
  - CPU-only execution
  - Compact model size (<500 MB)
- Deploy models locally within the application codebase, ensuring seamless integration without reliance on hosted AI services.
- Design and implement a Level 4 MLOps framework, including:
  - Monitoring and alerting
  - Drift detection
  - Retraining pipelines
  - Data feedback loops
- Develop strategies to handle domain evolution, including:
  - New agents / skills
  - New entity types
  - Updates to domain definitions
- Leverage historical queries and routing decisions to improve prediction accuracy and evidence generation.
- Collaborate with product, engineering, and domain teams to translate business workflows into scalable ML solutions.
- Provide technical leadership and architectural guidance across ML, NLP, and AI initiatives, mentoring engineers and driving scalable, enterprise-grade AI solution design.
- Deliver a working demo / prototype baseline and iteratively mature it into a production-ready system.

Required Skills

- Strong expertise in Machine Learning and Applied NLP, especially in:
  - Text classification
  - Intent detection
  - Query routing
  - Entity extraction
  - Semantic similarity and retrieval
- Strong Data Science background with hands-on experience in:
  - Recommendation systems
  - Forecasting and predictive modeling
  - Statistical analysis and feature engineering
  - Time-series analysis and predictive analytics
- Proven experience with non-LLM approaches, including:
  - Encoder-based models
  - Embedding-based pipelines
  - Classical ML (e.g., XGBoost, Logistic Regression)
  - Lightweight deep learning models
- Hands-on experience with LLM technologies, including:
  - Fine-tuning open-source or enterprise LLMs
  - Retrieval-Augmented Generation (RAG)
  - Prompt engineering
  - Model evaluation and optimization
  - Understanding of transformer architectures and embedding models
- Experience designing training datasets, labeling frameworks, and structured output schemas for multi-task NLP systems.
- Strong understanding of data preprocessing and quality improvement, including:
  - Normalization
  - Deduplication
  - Class imbalance handling
  - Synthetic data generation
