Data Engineer
Job Description

Key Responsibilities

Data Engineering & Pipeline Development
Design, develop, and maintain end-to-end data pipelines in Databricks using Spark and Delta Lake
Build and optimize ELT/ETL processes for structured and unstructured data ingestion into the Data Lakehouse
Implement scalable ingestion patterns (batch and event-driven) from internal systems, third-party APIs, and cloud sources
Develop data models (bronze, silver, gold layers) to support enterprise reporting, analytics, and downstream consumption

Data Platform & Integration
Integrate the Data Lakehouse with enterprise tools such as Tableau, Alteryx, and machine learning platforms
Design and implement data access controls, identity management, and secure data sharing mechanisms
Support API-based integrations and downstream data consumption patterns

Data Quality, Governance & Controls
Implement data quality checks, reconciliation processes, and monitoring within Databricks pipelines
Ensure adherence to enterprise data governance standards, including lineage, metadata, and audit requirements
Support regulatory and compliance requirements (e.g., data integrity, privacy, and security controls)

Cloud & Automation
Develop and manage workflows using orchestration tools (e.g., Airflow, Control-M)
Automate data pipelines, deployments, and operational processes through CI/CD pipelines
Leverage cloud-native services (AWS/Azure) for data processing, storage, and event-driven architectures

Operations & Support
Monitor, troubleshoot, and optimize data pipelines and Spark workloads for performance and reliability
Support production data platforms, including incident resolution and root cause analysis
Ensure high availability, data integrity, and SLA adherence across enterprise data systems

Collaboration
Partner with data architects, data scientists, BI teams, and business stakeholders to deliver data solutions
Participate in Agile ceremonies and contribute to iterative delivery of data products
Translate business requirements into scalable technical data solutions

Required Qualifications
3+ years of experience in data engineering, data platforms, or related roles
Hands-on experience with Databricks, Apache Spark (PySpark), and Delta Lake
Strong SQL and data modeling skills (relational and dimensional)
Experience building and supporting data pipelines in a cloud environment (AWS or Azure)
Experience with ELT/ETL tools (e.g., Fivetran, custom ingestion frameworks)
Familiarity with data orchestration tools (Airflow, Control-M)
Experience working in Agile development environments
Experience in financial services or regulated environments (e.g., banking, risk, regulatory reporting)
Knowledge of data governance frameworks and tools (e.g., Collibra)
Experience with real-time or streaming data pipelines
Exposure to machine learning pipelines and feature engineering in Databricks
Cloud certifications (AWS, Azure, or Databricks)

Technical Skills
Databricks (Lakehouse architecture, notebooks, jobs, Unity Catalog)
Spark / PySpark
SQL (advanced querying and optimization)

Required Skills: PySpark, SQL, Databricks, Financial