Senior ETL Data Architect with strong experience in AWS, Python, and SQL, and a healthcare background
Requires 15+ years of experience.

We are looking for a Senior ETL Data Architect with strong expertise in AWS-based data engineering, Python, SQL, and healthcare claims processing. This role involves designing and managing large-scale, HIPAA-compliant data pipelines handling millions to hundreds of millions of claims records. The candidate will act as a technical leader, working closely with analytics, clinical, compliance, and product teams.

Must-Have Skills

Healthcare Domain (Mandatory)
- Strong experience with ANSI X12 EDI transactions: 837P, 837I, 837D
- Knowledge of the full claims lifecycle: 835 (ERA), 270/271 (Eligibility), 276/277 (Claim Status)
- Experience with ICD-10, CPT, HCPCS, NPI, and Revenue Codes
- Understanding of HIPAA 5010 compliance
- Experience handling large-scale claims data (millions+)

AWS & Data Engineering
- Strong hands-on experience with:
  - AWS Glue (PySpark ETL pipelines)
  - Amazon Redshift (data warehousing & performance tuning)
  - Amazon Athena
  - Amazon S3 & Lake Formation
- Experience with:
  - Apache Iceberg (schema evolution, partitioning, time travel)
  - Amazon Kinesis (streaming ingestion)
  - AWS Step Functions / Lambda

Programming & ETL
- Strong in Python / PySpark
- Experience building ETL/ELT pipelines at scale
- Handling multi-format data: EDI, JSON, CSV, XML, APIs, HL7 FHIR

Databases & SQL
- Expert-level SQL: joins, CTEs, window functions, query optimization
- Hands-on experience with:
  - Amazon DynamoDB (GSI/LSI, single-table design)
  - PostgreSQL (partitioning, indexing, stored procedures)

Orchestration & DataOps
- Apache Airflow (MWAA) DAG development
- dbt transformations, testing, modeling
- CI/CD tools: GitHub Actions / AWS CodePipeline
- Infrastructure as Code: Terraform or CloudFormation

Data Governance & Compliance
- Experience with:
  - Data quality tools (Great Expectations / AWS Deequ)
  - Data lineage & monitoring (CloudWatch, SNS)
- Strong knowledge of:
  - HIPAA / HITECH compliance
  - Encryption (KMS), IAM access control

Key Responsibilities

Claims Data Processing
- Process and validate EDI 837 transactions at scale
- Handle complete claims lifecycle workflows
- Work with multi-source healthcare data ingestion

ETL & Data Architecture
- Build scalable AWS Glue pipelines
- Design Iceberg-based data lakes
- Optimize Redshift data warehouse performance

Data Engineering
- Design and manage DynamoDB & PostgreSQL systems
- Optimize queries for large-scale datasets

Orchestration & Automation
- Build and maintain Airflow DAGs
- Implement CI/CD pipelines and automation

Data Quality & Governance
- Ensure data accuracy, lineage, and auditability
- Maintain compliance with healthcare regulations

Advanced Analytics / ML
- Work with SageMaker / Redshift ML
- Build anomaly detection & duplicate claim detection systems

Preferred / Nice-to-Have
- AWS Certification (Data Engineer / Solutions Architect)
- Experience with:
  - Apache Kafka / Amazon MSK
  - AWS HealthLake / FHIR platforms
  - HEDIS, HCC/RAF models
  - Data Mesh architecture