Software Engineer - AI Evaluation
A company is looking for a Software Engineer - AI Evaluation Expert.
Key Responsibilities
Evaluate the performance of frontier language models on complex software engineering tasks
Identify bugs, logical errors, hallucinations, and reliability issues in model outputs
Design and review prompts, test cases, and evaluation scenarios for advanced coding workflows
Required Qualifications
At least 3 to 4 years of professional software engineering experience
Strong proficiency in at least one of the following: TypeScript, Ruby, Java, C++
Familiarity with modern development tooling used alongside AI / LLM systems (Git, CLI workflows, testing frameworks, etc.)
Ability to critically evaluate model behavior rather than simply use model outputs