Lead Data Engineer
ARCHIVED
We are seeking a Senior Data Engineer (open to Principal level) to lead the modernization and ownership of a critical data pipeline within a large-scale healthcare analytics environment. This role will focus on transitioning legacy SAS-based pipelines to modern Python/PySpark on Databricks, while driving engineering best practices and scalable data solutions.

This is a hands-on engineering role with a strong emphasis on development capabilities. Candidates with application development experience and exposure to AI/automation technologies will stand out.

Key Responsibilities

- Lead the modernization of data pipelines from SAS to Python/PySpark on Databricks
- Own and evolve a mission-critical HEDIS data pipeline used for performance measurement and reporting
- Design, build, and optimize scalable data pipelines in a distributed environment
- Collaborate with SMEs during an initial knowledge transfer period, with eventual full pipeline ownership
- Develop, schedule, and automate end-to-end data workflows
- Ensure data quality, reliability, and performance across large datasets
- Partner with cross-functional teams and analytics vendors to deliver high-quality data outputs
- Contribute to best practices in version control, CI/CD, and agile development workflows

Required Qualifications

- Strong development/engineering background (core requirement)
- Hands-on experience with Python (scripting and application development)
- Expertise in building and managing data pipelines and ETL workflows
- Experience processing large-scale datasets in distributed environments
- Proficiency with Databricks (notebooks, workflows, cluster management)
- Solid experience with AWS services including S3, Lambda, Glue, and EC2
- Strong SQL skills for complex transformations and data extraction
- Experience with pipeline orchestration and automation
- Familiarity with version control systems (Git) in a collaborative environment
- Experience managing work via issues, epics, and agile tooling

Preferred / Nice-to-Have

- Experience with AI, machine learning, or automation frameworks
- Exposure to healthcare data (e.g., HEDIS)
- Background in transitioning legacy systems to modern data platforms

What We Are Looking For (Priority Order)

1. Strong development engineering capabilities (must-have)
2. Application development experience, especially Python scripting
3. Expertise in AI or automation (highly desirable bonus)