JOBSEARCHER

Data Scientist for AI Training and Evaluation

Role Overview

Join a groundbreaking initiative to create realistic enterprise environments for training and evaluating frontier AI agents. As a Data Science & Analytics Expert, you will leverage your experience from leading technology, financial services, retail, or healthcare organizations to build high-fidelity digital workspaces that reflect real-world workflows. Your contributions will directly impact how AI systems are developed and assessed.

Key Responsibilities

- Construct a realistic digital workspace based on your daily use of Drive folders, including design documents, experiment write-ups, stakeholder presentations, SQL snippets, notebook exports, model cards, dashboards, and relevant email threads, along with the platforms that support these activities (e.g., Databricks, SAS Studio, Tableau, Power BI, Informatica PowerCenter, Talend).
- Design multi-step tasks that mirror your actual workflows, requiring navigation through various applications, files, and stakeholders to effectively challenge advanced AI agents.
- Collaborate with fellow data science and analytics professionals to design the environment, define task scope, and review scenarios for realism and rigor.
- Work asynchronously with research teams to refine task designs and establish evaluation criteria for data science agent benchmarks.
- Contribute to cutting-edge AI research and benchmarking, with your work informing how leading labs train and evaluate the next generation of AI systems.

Ideal Qualifications

- BS, MS, or PhD in a quantitative discipline.
- 3+ years of full-time experience in a Fortune 500 technology, financial services, retail, or healthcare enterprise.
- Experience in one or more of the following areas:
  - Applied data science (forecasting, causal inference, ML modeling).
  - Analytics engineering or data modeling.
  - Experimentation or A/B testing platforms and methodologies.
  - Product analytics and business insights.
  - ML engineering or decision science/operations research.
- Proficient in using Databricks, SAS Studio, Tableau, Power BI, and Informatica PowerCenter or Talend in daily operations.
- Strong analytical thinking and writing skills, with the ability to translate data science workflows into structured task specifications.

Compensation Note

- Task Completion Pay: Competitive, based on task quality (~$1,250 to $1,750 per completed task, subject to change as the project evolves).
- Performance Bonus: Top performers receive a weekly bonus incentive in addition to their per-task rate.
- Hourly Opportunity: Top performers may be invited to transition to an hourly compensation model based on sustained quality and throughput.