Research Engineer
Research Engineer, Foundation Models

About the Opportunity

We are seeking a Research Engineer to help advance the next generation of large-scale AI systems. This role sits at the intersection of research and engineering, focusing on the development, training, evaluation, and deployment of state-of-the-art machine learning models.

You will work across the full model lifecycle, from building large-scale datasets and training infrastructure to experimenting with new model architectures and inference techniques. This is an opportunity to contribute directly to cutting-edge work in large language models, reinforcement learning, long-context systems, and scalable AI infrastructure.

Responsibilities

- Develop and optimize training, evaluation, and deployment pipelines for large-scale AI models
- Improve inference efficiency, latency, and throughput across advanced model architectures
- Design and maintain research and production frameworks used for model development
- Train and scale foundation models across large distributed GPU environments
- Build and manage large-scale data processing, collection, and curation pipelines
- Create high-quality datasets to improve model performance and targeted capabilities
- Research, prototype, and benchmark novel model architectures and training approaches
- Contribute to experimentation in areas such as reinforcement learning, long-context modeling, reasoning systems, and inference optimization
- Collaborate closely with researchers and engineers to transition ideas from experimentation to production

Qualifications

Required

- Strong software engineering and systems development experience
- Deep understanding of modern machine learning and deep learning techniques
- Experience training, fine-tuning, or evaluating large language models
- Familiarity with distributed computing and large-scale infrastructure
- Experience building and maintaining data pipelines and ETL workflows
- Ability to design experiments, analyze results, and iterate on research directions
- Strong problem-solving skills and a research-oriented mindset

Preferred

- Experience working with large GPU clusters and distributed training frameworks
- Background in model optimization, inference systems, or AI infrastructure
- Contributions to machine learning research, open-source projects, or published work
- Experience with reinforcement learning, long-context models, or large-scale data systems

What We Value

- Ownership and accountability
- Strong collaboration and communication skills
- Bias toward execution and practical problem-solving
- Intellectual curiosity and continuous learning
- High standards for technical excellence and product quality
- Ability to thrive in fast-moving, high-impact environments

Compensation & Benefits

- Competitive base salary and equity package
- Comprehensive medical, dental, and vision coverage
- 401(k) program with employer matching
- Flexible paid time off policy
- Relocation assistance and visa sponsorship, where applicable
- Opportunity to work alongside a highly talented and mission-driven team
- Access to cutting-edge infrastructure and research resources

Keywords: Machine Learning, Artificial Intelligence, Deep Learning, Large Language Models, LLMs, Foundation Models, Generative AI, Applied AI, AI Research, Research Engineering, Model Training, Distributed Training, Pretraining, Fine-Tuning, Post-Training, Reinforcement Learning, RLHF, Reinforcement Learning from Human Feedback, Inference Optimization, Model Serving, Model Evaluation, Long Context Models, Reasoning Models, AI Infrastructure, GPU Clusters, High Performance Computing, HPC, Distributed Systems, CUDA, PyTorch, JAX, TensorFlow, Neural Networks, Transformer Models, Retrieval Augmented Generation, RAG, Synthetic Data, Data Engineering, Data Pipelines, ETL, Data Processing, Web Crawling, Data Collection, Feature Engineering, MLOps, ML Systems, Scalable Systems, Parallel Computing, Model Architecture Design, Experimentation, Research Scientists, Research Engineers, Software Engineering, Backend Engineering, Performance Optimization, Production ML, AI Agents, Agentic AI, Autonomous Systems, Prompt Engineering, Multi-Agent Systems, Vector Databases, Embeddings, Quantization, Model Compression, Infrastructure Engineering, Cloud Computing, Kubernetes, Python, C++, Open Source AI, Frontier Models, Applied Research, Statistical Learning, Computer Science, Algorithms, Large Scale Computing, Model Alignment, AI Safety, Training Infrastructure, Compute Optimization, Inference Systems, Foundation Model Research, Machine Learning Infrastructure, AI Platform Engineering, Systems Engineering, Data Infrastructure, Production Systems, Scalable AI Systems, Research & Development, Advanced AI Systems, Emerging Technologies, Distributed Computing, GPU Optimization, AI Product Development