
Lead Fabric Data Engineer @ remote work

Digle | Remote | May 18th, 2026
Lead Fabric Data Engineer
Remote work
Duration: 12 Months+

Responsibilities
- Design, develop, test, and maintain PySpark notebooks to ingest data from:
  - SharePoint (REST/Graph API)
  - Outlook (Graph/REST API)
  - 3rd party REST APIs
- Bronze layer (raw): persist ingestion in Delta Lake/Parquet with minimal transformations, preserving schema drift and auditing metadata
- Silver layer (curated): apply data quality checks, deduplication, normalization, standardization, and business-key alignment
- Implement and maintain a star-schema data model (facts and dimensions) for analytics
- Build and maintain CI/CD for data pipelines (Git, unit tests, deployment to Databricks/Azure Synapse, etc.)
- Implement monitoring, alerting, retries, and idempotent ingest strategies
- Data governance: lineage, masking of PII/PII-sensitive fields, role-based access
- Documentation: pipeline design docs, data dictionaries, and runbooks
- Collaborate with Data Analysts, Data Scientists, and Business Stakeholders to translate requirements into scalable pipelines
- Optimize performance (partitioning, caching, Delta tables, Spark configurations) and control costs

Deliverables
- PySpark notebooks for ingestion from all sources
- Bronze layer datasets (RAW) in Delta/Parquet
- Silver layer datasets (CURATED) in Delta/Parquet
- Star schema data model: fact and dimension tables
- Data quality checks, metrics, and dashboards
- Schema definitions, data dictionaries, and runbooks

Technologies & Tools
- PySpark / Spark SQL
- Delta Lake or Parquet-based storage
- Databricks or Azure Synapse Spark environments
- Azure Data Lake Storage Gen2 (or equivalent) for storage
- Data orchestration: Airflow, Prefect, Dagster, or equivalent
- REST APIs, especially Microsoft Graph (SharePoint, Outlook)
- SQL for data modeling and queries
- Version control with Git
- Basic data governance and security practices

Required Qualifications
- Bachelor's or Master's in Computer Science, Data Engineering, or related field
- 10+ years of experience as a Data Engineer
- Proficient in Python and PySpark; strong SQL skills
- Experience ingesting data from REST APIs (SharePoint/Graph API, Outlook/Exchange, 3rd party APIs)
- Demonstrated experience with Bronze/Silver/Gold data modeling, and star schema design
- Experience with Delta Lake / data lake architectures
- Familiarity with data quality frameworks and profiling
- Experience with cloud data platforms (Azure preferred: Databricks, ADLS Gen2, Synapse)
- Strong problem-solving, collaboration, and communication skills

Nice-to-Have
- Databricks certification or Azure Data Engineer Associate
- Experience with Graph API authentication (OAuth2), service principals
- Data governance tools and data masking concepts
- Experience with streaming/batch hybrid ETL patterns
- Knowledge of healthcare/finance data domain or other regulated data

Soft Skills
- Clear communication and ability to explain technical concepts to non-technical stakeholders
- Proactive, ownership-driven, and able to work in cross-functional teams
- Strong documentation and testing mindset