AI/ML Engineer — LLM & Agent Stack

Role summary
You’ll be an individual contributor embedding modern LLM applications into customer workloads and platform features. You’ll work closely with senior engineers and product teams to implement production-grade RAG pipelines, prompt/chain design, small agent runtimes (LangGraph/LangChain), vector DB integrations, and monitoring instrumentation on TrueFoundry. This role is hands-on and ideal if you enjoy shipping code, learning the latest agent tools, and turning prototypes into repeatable engineering patterns.

What you’ll do
- Implement, test, and maintain LLM-powered features and AI agent / RAG pipelines (prompting, retrieval, vector DB + embeddings).
- Build and extend agent workflows using LangGraph / LangChain or equivalent frameworks; help harden state persistence and retry logic.
- Integrate models and runtimes via the platform’s API (deploy/serve/instrument LLMs, configure token/cost guards).
- Write end-to-end tests, small services, and automation to reproduce customer issues and demo solutions.
- Instrument observability: logs, traces, latency/cost dashboards, and basic alerting for LLM workloads.
- Collaborate with product, support, and customers to convert POCs into documented, repeatable patterns.

Must-have
- 2–3 years of software engineering experience building backend services or ML infra; comfortable with Python (and one other language).
- Practical experience using LLMs (OpenAI/Anthropic/other) and building prompt + retrieval workflows.
- Familiarity with at least one vector DB (e.g., Chroma, Pinecone, Weaviate) and embeddings pipelines.
- Experience with REST/gRPC APIs, containers (Docker), and basic Kubernetes concepts.
- Strong debugging skills and the ability to write clean, testable code.

Nice-to-have
- Hands-on experience with LangChain or LangGraph and agent architectures.
- Experience with RAG evaluation, prompt engineering best practices, or prompt-testing frameworks.
- Exposure to production monitoring for LLMs (token usage, cost controls, latency SLAs).
- Prior experience deploying to or operating on Kubernetes.

Qualifications & signals we like
- BS/MS in CS or a related field (or equivalent industry experience).
- Public repo or demo showing an LLM project, small agent, or RAG pipeline.
- Curiosity about LLM safety, reliability, and cost-efficient deployment.