Data Engineer
Role: Databricks Engineer
Location: Wilmington, DE (5 days onsite)
Interview: Online
Type: Contract

Job Description

Overview
We are seeking a Data Engineer to lead the modernization of legacy ETL systems by migrating Ab Initio workflows to scalable, modular PySpark pipelines on Databricks. The role involves transforming complex data ecosystems into cloud-native architectures while ensuring data integrity, performance, and reliability.

Key Responsibilities

ETL Modernization & Development
- Analyze and migrate legacy ETL workflows from Ab Initio to PySpark-based pipelines
- Design and develop scalable data pipelines on Databricks
- Refactor monolithic processes into modular, reusable components
- Leverage existing enterprise datasets to avoid redundancy

Data Integration & Processing
- Build and maintain ETL/ELT pipelines integrating data from Snowflake and other sources
- Process and publish enriched datasets for downstream applications
- Support batch and near real-time data processing

Data Lineage & Optimization
- Create end-to-end data lineage and data flow diagrams
- Identify redundancies and drive process consolidation and optimization
- Ensure adherence to data governance and quality standards

Testing & Validation
- Develop unit, integration, and reconciliation frameworks
- Perform dual-run comparisons with legacy systems
- Validate outputs in UAT and pre-production environments

Deployment & Operations
- Support cutover and migration strategy from legacy systems
- Decommission legacy workflows and optimize scheduling (e.g., Control-M)
- Develop runbooks, monitoring, and operational documentation

Collaboration
- Work with data architects, analysts, and downstream application teams
- Coordinate user acceptance testing (UAT/FAT) and stakeholder sign-offs