ML Engineer

Talener
New York, NY
May 24th, 2026

Computer Systems Engineers/Architects
Computer Systems Design and Related Services

Title: ML Engineer
Location: Remote
Client: Global newswire and media organization.

Their content reaches more than half the global population daily. The tech org is modern and investing heavily in ML infrastructure to support large-scale media processing across text, image, and video.

Role Description

This is a senior hands-on ML engineering role focused on building and optimizing inference systems that run in production at scale. You'd be working across text, image, and video pipelines - processing millions of media assets to power news intelligence products. Think DistilBERT for NER, SBERT for embeddings, TransNetV2 for video shot detection, and external multimodal APIs for captioning.

This is not an MLOps or platform role. They need someone who can profile a transformer, rewrite its serving path for a 2-3x latency improvement, tune an HNSW index, and make smart infrastructure decisions on SageMaker instance selection to hit p95 targets at the lowest cost. If your background is primarily Terraform, Kubernetes admin, or CI/CD pipelines - this isn't the right fit.

You'll partner closely with MLOps, platform engineering, data scientists, and product teams - but ownership of model performance, inference logic, and pipeline efficiency lives here.

Required Skills

5+ years building production ML inference systems
Python - core to everything in this role
PyTorch (TorchScript, ONNX, FastAPI/TorchServe) and TensorFlow (SavedModel, tf.data, XLA, TFLite) - both required
Deep hands-on experience with transformer-based models (BERT family - DistilBERT, SBERT, etc.) in production
Inference optimization at scale - quantization, distillation, compilation, kernel/profile-level performance work
AWS infrastructure - EC2, Batch, Lambda, SageMaker across different media workload types
Hybrid search architecture experience - BM25 + vector search + cross-encoder reranking
Asynchronous processing systems - reliability, caching, deduplication, observability
Data pipeline and workflow orchestration (Airflow or similar)
Video frameworks - FFmpeg, large-scale frame-level inference
Must have experience in the media industry
Must have experience working with large amounts of data, including text, images, and videos

Nice to Have

Experience with TransNetV2 or similar video shot boundary detection
Familiarity with HuggingFace open source LLMs
OpenAI API or other foundation model provider experience
Hybrid CPU/GPU environment experience at scale

Compensation

Base salary up to 150,000.00 + 15% bonus target

For additional information or to apply, please contact Bethany Moulthrop at bmoulthrop@talener.com
