JOBSEARCHER

Senior Software Engineer, Research

Base Pay Range $175,000.00/yr - $220,000.00/yr About Us At Sully.ai, we’re building the most impactful healthcare company on earth. We believe that access to a great doctor is a basic human right. Today, that’s not a reality. Delays, misdiagnoses, administrative chaos, and burnout plague the system. Mission One Human, One Doctor. We build AI teammates that augment clinicians—scribes, nurses, receptionists, translators—all powered by our own world‑class models and deployed in real‑world care. Traction 450+ organizations signed 16 months AI agents cut admin by ~2.8 hours daily and reduce onboarding 85% 5M+ omkring clinical tasks completed to date, serving 36+ specialties Raised $25M from YC, Eric Yuen, Amity, Semper Virens Patented AI architecture (MedCon-1) outperforms GPT‑4.5, Gemini, Claude on clinical reasoning tasks Requirements Sully requires A‑players capable of 4 months = 1 year output. What You’ll Do Build and optimize core research infrastructure: evaluation pipelines, agent workflows, hallucination detectors, coding benchmarks, and research‑to‑production integrations. Design, implement, and scale agentic systems across backend, frontend, and model integrations, collaborating closely with research and co‑founders. Own reliability, observability, and performance across agents (logging, tracing, instrumentation, safety checks). Ship research‑proven features into production within 7 days Tung endunnels. Develop shared tools, SDKs, and internal products that accelerate iteration across Research, QA, and Engineering. What You Must Bring Senior‑level full‑stack engineering experience in React, TypeScript, and Node.js. Proven ability to design, ship, and scale LLM‑powered applications. Expertise in API design, streaming, and CI/CD pipelines. Strong cloud infrastructure background (AWS, GCP, or Azure). Track record of building reliable systems with measurable performance and error budgets. First‑Month Focus Audit all cross‑agent flows for UI/UX consistency, correctness, and performance gaps. Implement shared components, typed schemas, and contract‑driven interfaces for reliability. Establishantages instrumentation for frontend performance, agent consistency, latency, and model round‑trip tracing. Improve or replace brittleHumidity evaluation or agent pipelines identified during onboarding. Partner with Research to productionize at least one new capability. 90 Day OKRs Deliver production‑grade agentic workflows with