Principal AI Engineer

Company Description

Veritec AI develops AI-powered document processing solutions for legal and medical professionals. Our flagship products, FileFlow and LitHub, help organizations transform complex documents into actionable intelligence. We're a growing B2B SaaS company solving real problems in regulated industries where accuracy and reliability are paramount.

About the Role

We're looking for a Principal AI Engineer who lives and breathes applied AI. You've shipped real LLM-powered systems, not just experimented with them. You think in pipelines, obsess over eval design, and know firsthand how much the details matter when putting agents into production.

This role sits at the intersection of engineering and applied research. You'll lead the design and implementation of our core AI infrastructure, including RAG pipelines, evaluation frameworks, and multi-agent systems, and help define how we build with LLMs as the technology evolves.

We care far more about what you've built than where you built it. Side projects, open-source contributions, and personal systems count just as much as anything on a résumé.

You're someone who pushes LLMs to their limits, never satisfied with the first prompt that works, always probing what's actually possible.
You thrive in a culture of fast shipping and rapid experimentation, where the goal is to iterate your way to great results rather than design your way to them.

What You'll Do

- Design, build, and own production-grade RAG pipelines, covering everything from chunking and embedding strategies to retrieval optimization and re-ranking
- Architect and implement LLM evaluation frameworks, including automated evals, human-in-the-loop review, and regression testing across model versions
- Build and maintain multi-agent systems using modern agent frameworks, with a focus on reliability, observability, and cost efficiency
- Own LLM observability using Langfuse: tracing, cost monitoring, latency analysis, and quality tracking across model versions and pipeline stages
- Define best practices for prompt engineering, context management, and tool use across the team
- Evaluate and integrate new models, frameworks, and tooling as the ecosystem evolves
- Collaborate across the full stack, as our AI layer is deeply integrated with a React/NestJS product and an async microservices pipeline
- Mentor other engineers and establish patterns that scale beyond your own work

What We're Looking For

The most important thing: a track record of building AI-powered systems. We want to see your work: what you built, why you made the decisions you made, and what you learned.
This can come from a job, side projects, an open-source repo, or anything else.

8+ years of software engineering experience, with a meaningful portion of that focused on applied AI/ML systems.

Beyond that, we're looking for experience with:

- LLM application development: building real products or systems on top of models like GPT, Claude, Gemini, or open-source equivalents
- RAG systems: hands-on work with retrieval pipelines, vector databases (LanceDB, Pinecone, pgvector, etc.), and techniques like hybrid search, re-ranking, and query rewriting
- Evals: designing and running evaluations for LLM outputs, including building custom eval harnesses and tracking quality over time
- LLM observability: production experience with tools like Langfuse, LangSmith, or Braintrust to trace, monitor, and debug LLM pipelines at scale
- Agents and agent frameworks: practical experience with multi-step agentic workflows using frameworks like LangGraph, CrewAI, AutoGen, or similar, including production deployments
- Multi-model architectures: routing across models, orchestrating specialized agents, or combining models for different parts of a pipeline
- Async/event-driven systems: experience building AI workflows on top of message brokers; we use Azure Service Bus as our async backbone

Our Stack

We run a modern Azure-based microservices platform. On the product side, the stack is React, TypeScript, and NestJS. The AI pipeline is built on modern LLMs, with LanceDB for retrieval, a pub/sub messaging layer for async workflows, and Langfuse for LLM observability.
You don't need to have used every piece of this, but familiarity with the shape of it matters.

Nice to Have

- Experience with microservice or distributed pipeline architectures, especially event-driven or DAG-based patterns
- Open-source contributions or public projects in the AI/ML space
- Experience fine-tuning or adapting models (LoRA, RLHF, DPO)
- Familiarity with Azure AI services (OpenAI, Document Intelligence, AI Search)
- Experience working with structured outputs, function calling, and tool use

How to Stand Out

If you don't have GitHub repos you're willing to showcase, this is not the role for you. We're not looking for the most impressive job title. We're looking for someone who finds this stuff genuinely interesting and has the receipts to prove it.