
AI/ML Engineer — LLM & Agent Stack

TrueFoundry | San Mateo, CA | April 12th, 2026

Role summary

You’ll be an individual contributor embedding modern LLM applications into customer workloads and platform features. You’ll work closely with senior engineers and product teams to implement production-grade RAG pipelines, prompt/chain design, small agent runtimes (LangGraph/LangChain), vector DB integrations, and monitoring instrumentation on TrueFoundry. This role is hands-on and ideal if you enjoy shipping code, learning the latest agent tools, and turning prototypes into repeatable engineering patterns.

What you’ll do

- Implement, test, and maintain LLM-powered features and AI agent / RAG pipelines (prompting, retrieval, vector DB + embeddings).
- Build and extend agent workflows using LangGraph / LangChain or equivalent frameworks; help harden state persistence and retry logic.
- Integrate models and runtimes via the platform’s API (deploy/serve/instrument LLMs, configure token/cost guards).
- Write end-to-end tests, small services, and automation to reproduce customer issues and demo solutions.
- Instrument observability: logs, traces, latency/cost dashboards, and basic alerting for LLM workloads.
- Collaborate with product, support, and customers to convert POCs into documented, repeatable patterns.

Must-have

- 2–3 years of software engineering experience building backend services or ML infra; comfortable with Python (and one other language).
- Practical experience using LLMs (OpenAI/Anthropic/other) and building prompt + retrieval workflows.
- Familiarity with at least one vector DB (e.g., Chroma, Pinecone, Weaviate) and embeddings pipelines.
- Experience with REST/gRPC APIs, containers (Docker), and basic Kubernetes concepts.
- Strong debugging skills and the ability to write clean, testable code.

Nice-to-have

- Hands-on experience with LangChain or LangGraph and agent architectures.
- Experience with RAG evaluation, prompt engineering best practices, or prompt-testing frameworks.
- Exposure to production monitoring for LLMs (token usage, cost controls, latency SLAs).
- Prior experience deploying to or operating on Kubernetes.

Qualifications & signals we like

- BS/MS in CS or a related field (or equivalent industry experience).
- Public repo or demo showing an LLM project, small agent, or RAG pipeline.
- Curiosity about LLM safety, reliability, and cost-efficient deployment.