AI/ML Engineer – Text Chunking & Embedding Benchmarking (Remote, Full-time)
ARCHIVED
Our client, Tech Mahindra (Americas) Inc., is seeking the following. Apply via Dice today!

Job Description
We require an AI/ML Engineer with 8+ years of experience to benchmark and optimize text chunking and embedding strategies for large-scale clinical data (up to 1.5B documents). The role focuses on building scalable pipelines, evaluating models, and delivering recommendations aligned with PHI compliance and high-throughput requirements.

Key Responsibilities
- Evaluate embedding models and chunking strategies on clinical data
- Benchmark across cost, performance, scalability, and quality
- Design high-throughput embedding pipelines
- Deliver comparison reports and recommendations for large-scale deployment
- Build a reusable evaluation framework for future models
- Ensure compliance with PHI/HIPAA and governance standards

Required Skills & Experience
- Strong experience in AI/ML, NLP, and embeddings
- Hands-on experience with OpenAI, Cohere, SBERT, or similar models
- Experience chunking large documents (>200 pages)
- Knowledge of distributed systems and cloud platforms (Azure/AWS/Google Cloud Platform)
- Understanding of healthcare data compliance (PHI/HIPAA)

Preferred
- Experience with large-scale embedding pipelines (billions of records)
- Prior work on healthcare data platforms
- Understanding of audit logging frameworks and lineage tracking
- Ability to collaborate with cross-functional teams in regulated environments