Senior+ Data Scientist - ML & Image Generation
Hapiko · Brooklyn, NY · Full time

Build and optimize the ML pipeline behind Stickerbox, an AI-powered voice-to-sticker printer for kids.

About Hapiko

Hapiko is a Brooklyn-based company building the future of play.

Description

Founded by Arun Gupta (former CEO of Grailed, which sold to GOAT Group in 2022) and Bob Whitney (Anthropic, NYT Games), we're on a mission to create safe, hands-on AI experiences that fuel kids' imaginations rather than replace them.

Our first product, Stickerbox, is the world's first voice-to-sticker printer: a device that instantly transforms a child's spoken ideas into printable, colorable stickers. We sold out our first run shipping for the holidays, and it's already being called "one of the first products to make AI feel magical for kids and grounded for parents."

We have a $7M funding round led by Maveron (backers of Lovevery), Serena Ventures, and Ai2 (The Allen Institute). Stickerbox is bringing imagination to life for kids nationwide!

Why are we hiring?

The technical challenge is real. We're running real-time audio transcription, proprietary content safety systems, and custom image generation, all serving thousands of concurrent users with sub-second latency. We're training our own models from scratch, optimizing for kid-friendly aesthetics, and building safety guardrails that actually work. We need a Data Scientist to own data quality, evaluation, and ML optimization across this entire pipeline.
You'll work with the team to define what to train on, how to measure success, and how to make our models better every day.

What you'll do

As our first Data Science hire, you'll collaborate with us on:

- Build and curate large-scale image datasets for training custom models
- Design annotation pipelines and data quality processes
- Analyze training runs and model outputs to guide iteration
- Work with our team to define what to train on and how to evaluate it
- Optimize our transcription pipeline for accuracy and latency
- Improve image generation quality, prompt adherence, and consistency
- Identify bottlenecks and failure modes across the pipeline
- Run experiments and A/B tests to measure improvements

Safety & Content Moderation

- Refine content safety systems for child-appropriate outputs, and develop new ones
- Build on our evaluation datasets for safety edge cases
- Analyze moderation performance and reduce false positives/negatives
- Stay current on best practices for AI safety in generative systems

Evaluation & Metrics

- Build evaluation frameworks to measure model performance at scale
- Define metrics that correlate with user satisfaction (aesthetic quality, relevance, safety)
- Develop automated evaluation pipelines (LLM-as-judge, CLIP scores, human eval)
- Track experiments and communicate findings to the team

Prompt Engineering

- Optimize prompts for transcription accuracy and image generation quality
- Develop systematic approaches to prompt testing and iteration
- Build prompt templates and guidelines for different use cases

What we're looking for

- 5+ years in data science or applied ML
- Experience optimizing production ML systems
- Strong statistical and analytical skills
- Familiarity with LLMs and image generation models
- Python proficiency; comfortable with PyTorch
- Experience building evaluation frameworks
- Track record of improving ML system performance through data and experimentation

Nice to have

- Experience with content moderation or trust & safety
- Background in speech/audio ML or computer vision
- Experience with human annotation pipelines (Label Studio, Scale AI)
- Familiarity with prompt engineering techniques and LLM-based evaluation

Location: NYC only, on-site (flexible on WFH but we like to be in office the majority of the week) in our Brooklyn-based office, close to most major train lines.

Salary Range: $150k - $250k base + equity and benefits