Senior MLOps Engineer
Job Title/Role: Senior MLOps Technical Lead
Location: Cupertino, CA / Austin, TX (onsite mandatory)

Objective:
Build an intelligent, data-driven platform. The focus is to support the development of next-generation test analytics and test agents that enable faster insights, improved diagnostics, and scalable infrastructure for Generative AI systems connecting test stations, line-level data, and pipelines. You will build automated evaluation tools and conduct rigorous statistical analyses to ensure the reliability of both human and AI-based assessment systems. Benchmark, adapt, and integrate AI/ML models into existing software systems. Independently run and analyze ML experiments for real improvements.

Must-Have Requirements
Requirement: Details
Backend/Systems Experience: 3+ years building production backend or distributed systems (pre-AI experience required)
Production AI Systems: Has shipped AI/LLM features serving real users at scale, not just prototypes or demos
Agentic Systems: Has built AI agents, skills, tools, or MCP (Model Context Protocol) integrations
Python: Proficient for backend development
Secondary Language: Working knowledge of Go, TypeScript, or Rust
Cloud Infrastructure: Deep experience with AWS/Google Cloud Platform/Azure, including cost optimization and compute decisions, not just deployment
Container & Orchestration: Hands-on with Docker and Kubernetes; can build, deploy, debug, and scale services themselves
LLM Integration: Understands token economics, context limits, rate limiting, structured outputs, and API failure modes
LLM Evaluation: Understands how to evaluate LLM outputs and the inherent challenges (non-determinism, quality measurement, regression detection)
Hands-On Engineer: Not just an architect; writes code, debugs production issues, and deploys their own work
________________________________________
Preferred / Differentiators
- Built multi-step agentic workflows with tool use and function calling
- Experience with agent orchestration frameworks (LangGraph, CrewAI, or custom)
- Built guardrails, fallbacks, or graceful degradation for AI systems
- Streaming inference and async agent orchestration
- Cost/latency optimization: caching, batching, prompt compression
- ML observability tools: Langfuse, Arize, Braintrust, W&B
- Retrieval systems (vector search, hybrid search) as a tool, not the focus

Thanks & Regards
Darshan Neema
Client Account Manager, KTEK Resourcing
O 832-260-0695
W www.ktekresourcing.com
A 2277 Plaza Dr. Suite 240, Sugar Land, TX 77479