Software Engineer
Millbrae, CA
March 31st, 2026
You'll be hands-on in improving the real-world behavior of our AI systems: tracing and fixing runtime issues, building agent simulators, designing LLM evals and QA tools, and interfacing with client data. This is a role for builders who like prompt-level debugging, LLM system testing, and building infrastructure that improves our AI agents' performance.
You'll work across our AI agent platform: writing prompts, debugging runtime issues, building agent simulation tooling, creating evals, interfacing with client data, and helping us monitor system behavior at scale. This is not a model training role; it's an applied systems position focused on behavior, infrastructure, and debugging real-world agents in production.
You will be working at the forefront of agentic AI, where you'll be pushing the boundaries of our agents' capabilities.
Some examples of what you might work on:
Trace and fix runtime bugs, then write regression tests.
Design evaluation datasets to simulate realistic workflows or red-team our system.
Build internal tooling for QA and agent simulation.
Normalize and transform messy client data for system integration.
Set up automatic testing and latency tracking infrastructure.
Create dashboards and observability tooling for agentic system behavior.
Expand on our existing eval & testing framework and agent simulation infrastructure.
Technical Skills
Proficiency in TypeScript
Strong generalist software engineering skills
Strong debugging skills. You can trace runtime failures, dig through logs, and pinpoint issues in async or multi-step agent systems.
Data transformation and ingestion. You can build pipelines to normalize and convert unstructured data for use in AI systems.
Strong understanding of system design, including distributed systems and reliability/performance tradeoffs
Experience using modern AI coding tools (e.g. Cursor, GitHub Copilot, Claude)
Excellent documentation and testing discipline
Proficiency with Git
Soft Skills
You care about improving agent behavior. This is an applied systems position focused on behavior, infrastructure, and debugging real-world agents in production. You will be working at the forefront of agentic AI, where you'll be pushing the boundaries of our agents' capabilities.
You're high agency. AKA "agentic" ;) You can thrive with minimal structure. You are internally motivated. You proactively seek out ways to create value for your team.
You don't mind getting in the weeds. Improving agent performance requires diving deep into the details: identifying and understanding real-world edge cases, editing prompts to address them, and writing evals to cover them in the future. Sound exciting? You'll thrive. Sound tedious? You won't.
You're comfortable with ambiguity. You work well when specs are loose, or when the solution space spans prompts, code, and even a little RLHF.
You learn fast and move fast. You can pattern-match from past systems work and adapt to LLM-specific edge cases quickly.
We're looking for engineers with 2-7 years of experience who have worked closely with LLMs or AI agents in production systems. This is not a model R&D role; it's about applying AI to real-world use cases: debugging behavior, designing evals, and building the infrastructure to scale intelligent systems.
You might be a strong fit if:
You've created internal tools or frameworks to support QA, evals, or agent simulation, and care about making complex systems observable and testable.
You've contributed to fast-paced product cycles involving AI behavior, latency, and user experience, and you're comfortable validating behavior by inspecting outputs, not just logs.
Nice to have:
Experience with multi-agent systems, TTS/NLP pipelines, or structured output validation.
Familiarity with testing frameworks, LangChain-style agent orchestration, or in-house eval harnesses.
Experience with prompt engineering, LLM evals, and agent orchestration. You're comfortable writing and refining prompts, crafting evals, and reasoning about LLM outputs.