Senior Software Engineer II - Applied AI and Evaluations (Remote Eligible)

RemoteHunter | Remote | April 22nd, 2026

Software Developers | Computer Systems Design and Related Services

1. About Our Client:
The organization operates in the AI-powered work management industry, focusing on advancing intelligent agent platforms to enhance team productivity. It addresses the challenge of scaling AI agents from early prototypes to production-ready systems, emphasizing quality as a critical factor. The program develops solutions that automate manual tasks, uncover insights, and support scalable work management, impacting teams seeking smarter workflows. Their SmartAssist platform represents the next generation of AI-driven work management, targeting real-world application and continuous improvement at scale.

2. About the Opportunity:
The Senior Software Engineer II - Applied AI and Evaluations role is centered on owning and improving the quality of AI agents within the SmartAssist platform. This highly technical position involves diagnosing agent failures, designing evaluation systems, and driving measurable quality improvements for orchestrators and subagents. The role is key to ensuring these AI agents meet high standards across multiple quality dimensions and directly impacts the platform’s reliability and effectiveness. Collaboration with engineering and AI platform teams is essential to establish scalable methodologies and integrate quality assurance deeply into the development lifecycle.

3. Responsibilities:
• Own end-to-end agent quality including diagnosis, improvement, and validation
• Identify and prioritize failure modes in factual accuracy, completeness, tone, actionability, and latency
• Improve quality through prompt engineering, context engineering, and retrieval-augmented generation tuning
• Expand and mature the evaluation framework with scorers, datasets, regression gates, and production traffic evaluation
• Ensure every change has a measurable, attributable quality signal
• Collaborate with architecture leads to differentiate between prompt/context and structural quality issues
• Develop repeatable methodologies scalable across agents and subagents

4. Requirements:
• 8+ years of software engineering experience including 2+ years with production LLMs
• Hands-on expertise in prompt and context engineering affecting model behavior
• Strong knowledge of retrieval-augmented generation architectures, embedding models, and failure diagnosis
• Experience creating or extending LLM evaluation frameworks, including scorers and golden datasets
• Familiarity with agent system design and ability to participate in architectural decisions affecting quality
• Proficient in Python, comfortable with data-heavy environments such as Databricks or Delta tables
• Effective communication skills for conveying complex quality issues to diverse stakeholders
• Strong cross-functional judgment and ability to build credibility across teams
• Ability to bring clarity and structure in ambiguous situations
• Legal eligibility to work in the U.S. on an ongoing basis
• BS or MS in Computer Science, related field, or equivalent experience

Preferred:
• Experience with MLflow or similar experiment tracking tools
• Knowledge of CI-integrated evaluation pipelines
• Experience with multi-agent orchestration frameworks
• Background in Applied AI or LLMOps within a product company

5. Pay Range and Compensation Package:
• The pay range and compensation package for this role will be determined based on the candidate’s experience, skills, and other relevant factors.

6. Benefits & Perks:
• Employer subsidized medical, vision, and dental coverage for full-time employees
• 401k match of 50% on contributions up to 6% of eligible pay
• Monthly stipend to support work and productivity
• Flexible Time Away Program and Sick Time Off
• Life insurance, short-term, and long-term disability plans for U.S. employees
• 12 paid holidays annually for U.S. employees
• Up to 24 weeks of Parental Leave
• Personal paid Volunteer Day
• Professional growth opportunities and access to Udemy online courses
• Company-funded perks including counseling membership and local discounts
• Teleworking options from any registered U.S. location (role-specific)

Equal Opportunity Statement: Our client is an equal opportunity employer. They celebrate diversity and are committed to creating an inclusive environment for all employees. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, or national origin.

Note: RemoteHunter is not the Employer of Record (EOR) for this role. Our purpose in this opportunity is to connect exceptional candidates with leading employers. We help job seekers worldwide discover roles that match their goals and guide them to complete their full application directly through the hiring company’s career page or ATS.
