JOBSEARCHER

Machine Learning Engineer

D24 Search
Ny, WAL
April 24th, 2026
Machine Learning Engineer
NYC (Brooklyn), onsite
Up to $300K + Equity

My client is a VC-backed startup building the data layer for real-world AI training. We work with frontier labs to turn messy, multi-modal enterprise data into the highest-quality training data on the market, sourced from the hundreds of venture-backed startups we help wind down.

We're a fast-growing team based in-person in Dumbo, Brooklyn. Backed by Floodgate, Afore Capital, Hustle Fund, and incredible entrepreneurs.

The Role

As an ML Engineer, you'll take the cleaned, resolved data coming out of our pipeline and figure out what to build with it. The raw material is unique: real codebases, tickets, messages, docs, and decisions from real companies, with every linkage preserved. The open question is how to turn that into the most valuable training data on the market. You'll have wide latitude and direct access to the CEO and CTO on direction. All of this happens on deeply sensitive data, so everything we build is designed with security and privacy at the core.

Requirements:
- 3-8 years of experience in applied machine learning, with work training or fine-tuning models
- Experience training machine learning models with less defined or abstract data sources
- Experience with training data curation and evaluations
- Background at an RL environment, data labeling, or data-for-AI company
- Worked at an early-stage startup (sub-50 people)
- BS+ in CS, ML, or related quantitative field
- Strong Python and familiarity with relevant ML libraries
- Experience developing or fine-tuning transformer-based models for Applied AI (huge bonus)
- Overlap with sensitive data processing: NER, NLP, entity resolution
- High agency and comfort with ambiguity; would rather pick the right problem than be handed one

What You'll Work On

You'll own problems end-to-end.
Some examples of what you might tackle in your first 90 days:
- Extracting realistic, verifiable agent tasks from linked repos, tickets, and PRs
- Building environments from real company snapshots where the reward signal comes from how work actually got done
- Augmenting datasets with synthetic variants without losing the realism that makes them valuable
- Running experiments to understand which enrichments actually move the needle for the labs buying from us

You Might Be a Fit If
- You've trained or fine-tuned models and shipped applied ML work
- You're creative and high-agency; you'd rather pick the right problem than be handed one
- You're excited about applied work with real data
- AI is deeply integrated into your workflow and life

Why candidates should join
- We recently started our data business, which is already doing close to 5x our entire revenue from last year. We acquire and license data from venture-backed startups that are winding down, then clean and enrich that multimodal data (email, Slack, code, images, video, financial records: a company's entire digital footprint).
- We work with a majority of the large AI labs and data layer companies.
- We are projecting 10x on last year's ARR.
- We have raised $5.5M from Floodgate, Afore Capital, Hustle Fund, and notable entrepreneurs.
- You will have a lot of freedom to help us decide what data to utilize and how to shape it effectively so that it's in a trainable format for labs and other data providers.