
Engineering Lead - QA Systems

Agi
Millbrae, CA
May 29th, 2026

Think Different. Build the Future. 🚀

Our Mission

Build everyday AGI. Trustworthy, consumer-grade agents that redefine human–AI collaboration for millions. Software shouldn’t wait for commands; it should partner with you, amplifying what you can do every single day.

Why AGI, Inc.

We’re a stealth team of elite founders and AI researchers, with backgrounds spanning Stanford, OpenAI, and DeepMind. We’re industry leaders in mobile and computer-use agents, bringing these capabilities to consumer scale.

Grounded in years of agent research, our AI is designed with trustworthiness and reliability as core pillars, not afterthoughts.

We are supported by tier-1 investors who funded the first generation of AI giants; now they’re backing us to build the next: everyday AGI.

If you see possibility where others see limits, read on.

You'll own quality for an AI product that is non-deterministic, runs on hardware you don't control, and ships into partner builds with hard launch dates. This is for the engineer who finds existential satisfaction in catching the bug before a user does, so that the partner exec finds out from your dashboard, not from their inbox.

🤩 Tasks you will own

- The testing systems that gate every release: automated agent test suites, on-device regression harnesses, model version compatibility matrices, and the device farm that runs them
- The bug pipeline: triage, repro, root-cause, and the post-mortems that keep the same bug from shipping twice
- The dashboards and SLAs that tell the team, in real time, whether what we shipped yesterday still works today

🤚 Areas where you will assist

- Research, on what to test about model behavior
- Product engineers, on what to test about agent reliability
- Forward-deployed engineers, on what partners actually care about in their environment

📚 Skills you'll be expected to teach

- How to test a system that gives a different answer every time
- How to build test infrastructure that scales from one shipped device to millions

🧑‍🎓 Skills you'll be expected to learn

- Eval drift, locale-specific failures, hardware-class regressions, and the rest of the long tail of QA-ing AI in production
- What shipping consumer AI at OEM scale actually requires
- Reliable agentic systems from the people who published the canonical papers on it

🏆 Timeline of success

- After 30 days: You've audited every test we run today and produced a sharp doc on what's automated, what's manual, and what's nothing at all. You've stood up at least one piece of regression coverage that should have existed already.
- After 60 days: You've shipped a real testing system (automated agent regressions, an on-device test farm, or a partner build verification harness) that the team relies on. Bug triage runs on rails you set up.
- After 90 days: Your systems have caught real regressions before they shipped. Engineers across research, product, and FDE write code differently because of the harness you built. You're shaping the next quarter's quality roadmap.

💰 Compensation

Competitive cash and meaningful equity. Top-tier relocation and immigration support. SF, in person.