AI Agent Engineer

The Company

Our client is a mobile-first consumer social platform where every post is interactive. Users scroll a feed of playable mini-apps and can create or remix new ones by describing what they want. Creation is powered by a coding agent that turns natural-language prompts into working, shareable experiences.

Series-A, backed by top-tier consumer and AI investors. ~20 people, ~10 engineers. Founding team has multiple prior exits.

The role

You'll be the technical anchor for the creation flow: the coding agent that turns "make me a Flappy Bird-style game with my friend as the character" into a working, playable, remixable mini-app. The team is doubling down on creation quality and reliability. You'll run this lane end-to-end and set the technical bar for agent-driven app generation.

You'll own

- The agent runtime and orchestration layer
- Long-horizon agent workflows: prompt → plan → generate → run/validate → repair → publish
- Evaluation and quality loops: eval harnesses, regression testing, failure taxonomy
- Model strategy: routing across closed-source and open-source providers (including Chinese open-source models), benchmarking, cost/latency optimization
- Debuggability: tracing, metrics, alerts, and observability across agent runs

What we're looking for

- 4+ years of software engineering experience, including hands-on production work on agentic systems: tool use, orchestration, retry/repair loops, context management
- Professional or native-level Mandarin (required)
- AI-native: Claude Code, Cursor, or equivalent as your default mode of building. The team is calibrated to this: even the PM ships product work in Claude Code, and the engineering bar starts there.
- Strong Python or TypeScript
- Experience shipping and operating production systems
- Strong product instincts; can translate user-experience goals into system design decisions
- Comfort with modern agent frameworks. Specific framework match isn't required: LangChain, LangGraph, Pydantic AI, Vercel AI SDK, Mastra, Inngest, Temporal, and custom stacks all count. What matters is depth on the underlying primitives.

Bonus

- Built a coding agent, devtool, IDE assistant, or code generation pipeline before
- Built an eval framework with release-gating on metrics
- Designed MCP-style tool protocols or robust tool interfaces for LLM systems
- Have opinionated views on multi-provider model routing (when open-source models beat closed ones, cost/latency tradeoffs in production)
- Active in the AI agent / open-source ecosystem (LangChain, LangGraph, Pydantic AI, MCP, DSPy, Mastra, Inngest, Temporal, Vercel AI SDK, or similar)

H1B transfer sponsored