Data Architect Principal Engineer / Architect
Hi,

Send your resume to Dilip@cloudingest.com

Title: Data Architect Principal Engineer / Architect
Department: Data Centre of Excellence (Data COE)
Location: Seattle, Washington
Job: Full Time
Experience: 12–18 Years (Principal IC)
Client: Direct

Role Overview

We are looking for a Principal Data Architect to define and own the architectural direction of our enterprise data platform. Operating at the intersection of data engineering, AI/ML infrastructure, and cloud platform strategy, you will set the technical vision that underpins how data is ingested, stored, served, and consumed across a federated multi-cloud environment. This is an individual contributor role at the principal level: you are expected to shape the architecture, influence engineering decisions across domains, and drive the platform from its current state toward a modern, governed, AI-ready data estate.

Key Responsibilities

Platform Architecture
- Define the enterprise data platform reference architecture spanning Azure, AWS, and GCP, covering ingestion, storage, transformation, serving, and real-time streaming layers
- Own architectural decisions for Databricks / Delta Lake as the core lakehouse platform; define Delta Sharing, Unity Catalog governance, and multi-cluster design patterns
- Design and govern cloud infrastructure patterns using Terraform; establish IaC standards for all platform assets across cloud providers
- Lead Kubernetes (K8s) architecture for containerised data platform services, including operator patterns, resource management, and workload isolation for data and ML workloads
- Architect Snowflake environments across multi-cloud deployments: data modelling, performance tuning, role-based access, dynamic data masking, zero-copy cloning, and cross-cloud data sharing via Snowflake Data Clean Rooms and Marketplace
- Define API gateway strategy for data product serving: access control, rate limiting, versioning, and consumer onboarding patterns

Streaming & Orchestration
- Architect real-time event streaming infrastructure on Apache Kafka: topic design, schema evolution, partitioning strategy, and cross-domain event mesh patterns
- Set orchestration standards using Apache Airflow: DAG design patterns, dependency management, and operational reliability at scale

AI / ML Data Infrastructure
- Design and own the ML data infrastructure layer: feature store (Feast), model experiment tracking (MLflow), and data versioning practices for reproducible ML
- Architect vector database infrastructure using Pinecone and/or FAISS for retrieval-augmented generation (RAG) and semantic search use cases
- Define data contracts and feature serving patterns that bridge the platform engineering and ML engineering domains

Observability & Reliability
- Define platform observability standards using Prometheus and Grafana: pipeline SLIs/SLOs, data quality alerting, and infrastructure health dashboards
- Drive platform reliability engineering practices: incident response, chaos engineering, and capacity planning across the data estate

Leadership & Governance
- Act as technical authority for data architecture decisions; participate in architecture review boards and provide design guidance to domain engineering teams
- Produce and maintain Architecture Decision Records (ADRs) and reference architectures for platform-wide consumption
- Mentor senior engineers; raise the architectural bar across the Data COE through design reviews, standards documentation, and community of practice leadership

Must-Have Skills
- Multi-cloud architecture: Azure, AWS, and GCP; deep expertise in at least two, working knowledge of the third
- Snowflake (expert-level): multi-cloud deployment, data modelling, performance tuning, RBAC, dynamic data masking, zero-copy cloning, Data Sharing, and Snowpark for Python/Java workloads
- Databricks / Delta Lake: production-scale lakehouse design, Unity Catalog, Delta Sharing, cluster architecture, and Databricks Workflows
- Apache Spark: distributed processing at scale; performance tuning, resource optimisation, and advanced transformation patterns
- Apache Kafka: enterprise streaming architecture; topic design, schema registries, consumer group management, and cross-cloud event mesh
- Apache Airflow: production-grade orchestration; DAG design patterns, plugin development, and reliability at scale
- Kubernetes (K8s): container orchestration for data platform services; operators, resource quotas, and workload isolation
- Terraform: IaC for multi-cloud data infrastructure; module design, state management, and CI/CD integration
- MLflow: ML experiment tracking, model registry, and lifecycle management
- Feast: feature store design, offline/online serving architecture, and ML pipeline integration
- Vector databases: Pinecone and/or FAISS; embedding infrastructure for RAG and semantic search use cases
- API gateways: data product API design, access control, and consumer management patterns
- Prometheus / Grafana: observability stack design; metrics collection, alerting rules, and dashboard standards
- 12+ years of data engineering or architecture experience, with 4+ years operating at a principal or staff IC level

Thanks,
Dilip kumar