Data Engineer - Hybrid
Occupations:
- Data Warehousing Specialists
- Database Architects
- Computer Systems Engineers/Architects
- Software Developers
- Data Scientists

Industries:
- Web Search Portals, Libraries, Archives, and Other Information Services
- Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services
- Educational Support Services
- Media Streaming Distribution Services, Social Networks, and Other Media Networks and Content Providers
- Business Support Services

Data Engineer - Hybrid Opportunity

This is a hybrid position based at our corporate office in Brentwood, TN, with on-site work required Monday through Wednesday.

This role requires a highly technical Data Engineer with expert-level proficiency in Azure Databricks, distributed data pipelines, and large-scale healthcare data processing. This role focuses on designing and implementing high-throughput ingestion pipelines, transactional lakehouse layers, and secure PHI data flows using Azure-native services and Databricks runtime optimizations.

You will build and operate production-grade data pipelines that meet rigorous requirements for security, lineage, compliance (HIPAA), observability, and operational SLAs, supporting analytics, AI, and clinical insights across the organization.

Core Responsibilities

Platform & Architecture
- Architect and implement scalable data processing pipelines using:
  - Databricks Runtime (Apache Spark, Spark SQL, MLflow, Delta Lake)
  - Delta Lake ACID transactions, Z-Ordering, OPTIMIZE, and Change Data Feed (CDF)
  - Unity Catalog for governance, lineage, RBAC, and audit controls
- Design and enforce a medallion (Bronze/Silver/Gold) architecture with schema evolution, Delta Live Tables (DLT), and robust error-handling patterns
- Build high-performance ingestion frameworks for:
  - FHIR and HL7 message streams
  - X12 837/835 healthcare claims data
  - EHR/EMR source systems
  - Batch, real-time, and event-driven data sources

Azure Cloud Engineering
- Develop and operate data pipelines leveraging:
  - Azure Data Lake Storage Gen2 (hierarchical namespace, ACLs, POSIX permissions)
  - Azure Data Factory or Synapse Pipelines (parameterization, dynamic pipelines, triggers)
  - Azure Event Hubs and/or Service Bus for streaming ingestion
  - Azure SQL Database and Azure Synapse (Dedicated and Serverless pools)
  - Azure Functions for lightweight orchestration and automation
  - Azure Monitor, Log Analytics, and Application Insights for observability
- Implement enterprise-grade security including:
  - VNet integration and private endpoints
  - Secrets and key management using Azure Key Vault
  - Managed identities and least-privilege access controls

Distributed Data Engineering
- Develop optimized PySpark and/or Scala pipelines using advanced Spark techniques:
  - Catalyst optimizer tuning
  - Cluster sizing and autoscaling strategies
  - Adaptive Query Execution (AQE)
  - Efficient join strategies (broadcast vs. shuffle)
- Build and maintain:
  - High-volume batch ETL pipelines (100M+ records)
  - Low-latency streaming pipelines using Spark Structured Streaming
- Implement CI/CD for Databricks environments, including:
  - Git-integrated DEV/QA/PROD workspaces
  - Automated job and workflow deployments
  - Unit testing using pytest and Databricks testing frameworks

Healthcare Data & Compliance
- Design and implement secure PHI pipelines compliant with:
  - HIPAA Privacy and Security Rules
  - SOC 2 and HITRUST-aligned controls
- Build pipelines supporting healthcare data standards including:
  - FHIR R4 resources (Patient, Encounter, Observation, Claim, etc.)
  - HL7 v2.x messages (ADT, ORU, ORM)
  - X12 EDI transactions (837, 835, 270/271)
- Ensure end-to-end lineage tracking, auditability, and data retention across all lakehouse layers

Required Qualifications
- 5+ years of experience in modern data engineering roles
- Expert-level proficiency in:
  - PySpark and Spark SQL
  - Databricks (Jobs, Workflows, Repos, Delta Live Tables)
  - Delta Lake architecture and transactional design patterns
  - Azure Data Factory or Azure Synapse Pipelines
  - Cloud-native data security (RBAC, ABAC, privilege boundary enforcement)
- Strong experience working with healthcare data formats and standards:
  - FHIR (JSON)
  - HL7 v2/v3
  - X12 EDI claims data
- Deep understanding of distributed systems, data partitioning strategies, concurrency, and cluster resource tuning

Preferred Qualifications
- Experience implementing Unity Catalog at enterprise scale
- Familiarity with MLOps workflows and Databricks MLflow
- Experience using dbt with Databricks SQL
- Relevant certifications, including:
  - Databricks Data Engineer Professional
  - Microsoft Azure DP-203
  - HL7 or FHIR certification (nice to have)

Benefits
- Comprehensive health, dental, and vision insurance
- Health Savings Account with an employer contribution
- Life insurance
- PTO
- 401(k) retirement plan with a company match
- And more!

ENVIRONMENTAL/WORKING CONDITIONS: Normal busy office environment with much telephone work. Possible long hours as needed.

The description is intended to provide only basic guidelines for meeting job requirements. Responsibilities, knowledge, skills, abilities and working conditions may change as needs evolve.

If you are viewing this role on a job board such as Indeed.com or LinkedIn, please know that pay bands are auto-assigned and may not reflect the true pay band within the organization.

No Recruiters Please