Lead Data Engineer – Databricks, Spark & Data Platforms

Lead Data Engineer – AI Data Products
Contract-to-Permanent Hire
100% Remote (8AM-5PM CST)

Our Fortune 50 healthcare client’s AI/ML platforms group is seeking a modern Lead Data Engineer to provide technical leadership and delivery oversight across multiple AI data products within their enterprise AI Hub. This role is primarily focused on technical direction, architectural guidance, and team leadership (~75%), while remaining hands-on (~25%) in building scalable data pipelines, CI/CD automation, and AI-enabling data assets across multiple concurrent initiatives.

Responsibilities:
- Provide technical leadership across multiple AI Data Product initiatives and engineering workstreams.
- Understand and clarify technical requirements, recommend architecture/design elements, and set overall technical direction across projects.
- Design, implement, and maintain scalable ETL/ELT pipelines and distributed data workflows using Databricks/Spark technologies.
- Implement and optimize CI/CD pipelines, data operations workflows, and cost management strategies across the data platform.
- Build and support AI-enabling data assets such as vector stores, feature tables, Genie Rooms, and semantic AI context assets, while ensuring integration into model development workflows.
- Partner with AI/ML, analytics, platform, and business teams to deliver production-grade data solutions.
- Support platform visibility by delivering operational insights into platform utilization, cost trends, and financial operations.
- Oversee and support Junior-Senior Engineers through POCs, technical guidance, troubleshooting, and code reviews.

Requirements:
- Strong hands-on experience with Databricks Data Engineering and Spark distributed computing. Hadoop ecosystem experience is a plus.
- PySpark and Python expertise for large-scale data processing.
- Strong SQL skills and experience with data warehouses and data analysis.
- Hands-on experience building data pipelines (batch and streaming).
- Experience working with columnar data formats (Parquet, Delta).
- Experience with DevOps practices, CI/CD pipeline development, and Git workflows (GitHub/GitLab).
- Familiarity with Linux scripting fundamentals (for pipeline and CI/CD automation).
- Exposure to emerging AI data infrastructure, such as building vector stores and applying DataOps / MLOps practices.
- Technical leadership across multiple concurrent projects – providing architectural guidance, defining technical work, setting technical direction, leading code reviews, and mentoring engineers.

*US Citizens & Green Card holders only*