Senior Data Engineer (Snowflake & Observability Implementation)
Client: Workiva
Job Title: Senior Data Engineer – Data Observability (Snowflake, dbt, DMF)
Location: LATAM
Duration: 6 months

Role Overview

We are looking for a highly skilled Senior Data Engineer with strong experience in Data Observability to help operationalize and scale a robust data reliability framework. This role will focus on implementing end-to-end observability across dbt, Snowflake Data Metric Functions (DMFs), Splunk, and OpsGenie, ensuring proactive detection and resolution of data quality issues across critical data assets.

The mission is to establish a self-healing, highly visible data reliability layer that eliminates silent data failures and enables faster incident response.

Key Responsibilities

1. Unified Metadata Collection & Persistence
- Standardize and automate dbt metadata capture across all model runs.
- Build and maintain a hardened dbt-to-Snowflake logging pipeline to persist run_results.json and manifest.json into an Observability schema.
- Implement automated cleanup and retention policies to manage storage efficiently.
- Apply data observability rules at scale by pushing dbt checks into Snowflake DMFs.

2. Data Quality & Observability Framework
- Implement and manage observability rules across key dimensions:
  - Validity (data types, formats)
  - Freshness (timeliness, latency)
  - Volume (record count reconciliation)
  - Schema & Values (structural and value changes)
  - Distribution (anomaly detection, trend deviations)
- Ensure data quality monitoring for high-priority and Tier-1 tables.

3. DMF Thresholding & Performance Optimization
- Design and implement targeted Snowflake DMFs instead of blanket monitoring.
- Define dynamic thresholds (e.g., standard deviation-based) to reduce alert fatigue.
- Analyze and optimize DMF credit consumption, keeping monitoring costs within 5–10% of total compute.

4. Observability & Alerting (Splunk Integration)
- Build a single pane of glass for data observability using Splunk.
- Create high-performance alerts correlating dbt job failures with DMF violations.
- Ensure alerts include contextual payloads such as:
  - Failing dbt model or code link
  - Table owner
  - Downstream BI impact

5. Incident Management & SLA Enforcement (OpsGenie)
- Integrate Splunk alerts with OpsGenie for actionable incident management.
- Configure:
  - Priority-based alert routing (Warning vs Critical)
  - Auto-resolution of alerts when issues self-heal
  - SLA tracking for MTTD (Mean Time to Detect) and MTTR (Mean Time to Resolve)

6. Reporting & Executive Visibility
- Build a Data Reliability Executive Dashboard (Splunk or Snowflake/Streamlit) to provide:
  - Overall Data Health Score
  - Volume and Freshness trends
  - Top offending models/tables
  - Month-over-month MTTD and MTTR improvements

Operational Standards & Documentation
- Create detailed Runbooks / SOPs for on-call engineers.
- Implement Monitoring-as-Code using version control (Terraform, dbt project files).
- Maintain a weekly Observability Health Dashboard to identify noisy or unstable models.

Required Skills & Experience
- Strong hands-on experience with Snowflake, especially Data Metric Functions (DMFs)
- Advanced experience with dbt (metadata, testing, orchestration)
- Proven experience integrating observability tools like Splunk
- Experience with incident management platforms (OpsGenie preferred)
- Strong SQL and data modeling skills
- Experience building scalable, production-grade data pipelines
- Familiarity with cost optimization and performance tuning in Snowflake

Nice to Have
- Experience with Streamlit dashboards
- Infrastructure-as-Code (Terraform)
- Background in data governance or data reliability engineering
- Experience supporting on-call or production data platforms