{"schemaVersion":"jobsearcher.job.v1","id":"a524e9fabd4abb93fbc2e54a","url":"https://jobsearcher.com/jobs/a524e9fabd4abb93fbc2e54a","canonicalUrl":"https://jobsearcher.com/jobs/a524e9fabd4abb93fbc2e54a","title":"Senior Data Engineer INDIA","description":"Senior Data Engineer\r\nRemote Work: INDIA *Only Consultants local to INDIA are eligible.\r\n*No visa Sponsorship\r\nPrimary Responsibilities\r\nDesign, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads\r\nBuild and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency\r\nDesign and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration\r\nDevelop high-quality SQL code including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns\r\nCreate and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring\r\nImplement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies\r\nBuild automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets\r\nCollaborate with cross-functional teams including data architects, analysts, scientists, and business stakeholders to understand requirements and deliver scalable, well-designed data solutions\r\nLead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability\r\nSupport cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, testing automation, and environment management\r\nMonitor and troubleshoot 
production data pipelines, proactively addressing issues, performance bottlenecks, and system failures\r\nContribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency\r\nRequired Qualifications\r\n5+ years of experience in data engineering, software engineering, or similar disciplines\r\nHands-on experience with Databricks or Snowflake\r\nExperience with orchestration tools such as Apache Airflow\r\nExperience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)\r\nAdvanced SQL skills and experience with OLTP and OLAP data modeling\r\nSolid understanding of modern data warehousing, data lake, and ELT/ETL design patterns\r\nFamiliarity with data governance tools, especially Microsoft Purview\r\nSolid programming expertise in Python, PySpark, or similar languages\r\nPreferred Qualifications\r\nHealthcare industry experience, including claims, clinical, FHIR, HL7, or provider data\r\nExperience with containerization (Docker, Kubernetes) for data workloads\r\nExperience supporting machine learning workflows or analytical data science pipelines\r\nKnowledge of distributed computing concepts and performance tuning","company":"Vytwo","rawCompany":"vytwo","city":"Prosper","state":"TX","isRemote":false,"isActive":false,"createdAt":"2026-04-09T15:39:50.067Z","occupations":[{"code":"15-1243.01","title":"Data Warehousing Specialists","slug":"data-warehousing-specialists"},{"code":"15-1243.00","title":"Database Architects","slug":"database-architects"},{"code":"15-2051.00","title":"Data Scientists","slug":"data-scientists"}],"industries":[{"code":"541512","title":"Computer Systems Design Services","slug":"computer-systems-design-services"},{"code":"541511","title":"Custom Computer Programming Services","slug":"custom-computer-programming-services"},{"code":"513210","title":"Software 
Publishers","slug":"software-publishers"}],"jobPosting":{"@context":"https://schema.org","@type":"JobPosting","title":"Senior Data Engineer INDIA","description":"Senior Data Engineer\r\nRemote Work: INDIA *Only Consultants local to INDIA are eligible.\r\n*No visa Sponsorship\r\nPrimary Responsibilities\r\nDesign, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads\r\nBuild and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency\r\nDesign and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration\r\nDevelop high-quality SQL code including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns\r\nCreate and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring\r\nImplement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies\r\nBuild automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets\r\nCollaborate with cross-functional teams including data architects, analysts, scientists, and business stakeholders to understand requirements and deliver scalable, well-designed data solutions\r\nLead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability\r\nSupport cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, testing automation, and environment management\r\nMonitor and troubleshoot production data pipelines, proactively addressing issues, performance bottlenecks, and system 
failures\r\nContribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency\r\nRequired Qualifications\r\n5+ years of experience in data engineering, software engineering, or similar disciplines\r\nHands-on experience with Databricks or Snowflake\r\nExperience with orchestration tools such as Apache Airflow\r\nExperience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)\r\nAdvanced SQL skills and experience with OLTP and OLAP data modeling\r\nSolid understanding of modern data warehousing, data lake, and ELT/ETL design patterns\r\nFamiliarity with data governance tools, especially Microsoft Purview\r\nSolid programming expertise in Python, PySpark, or similar languages\r\nPreferred Qualifications\r\nHealthcare industry experience, including claims, clinical, FHIR, HL7, or provider data\r\nExperience with containerization (Docker, Kubernetes) for data workloads\r\nExperience supporting machine learning workflows or analytical data science pipelines\r\nKnowledge of distributed computing concepts and performance tuning","datePosted":"2026-04-09T15:39:50.067Z","dateModified":"2026-04-09T15:39:50.067Z","hiringOrganization":{"@type":"Organization","name":"Vytwo","sameAs":"https://jobsearcher.com"},"jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Prosper","addressRegion":"TX","addressCountry":"US"}},"identifier":{"@type":"PropertyValue","name":"JobSearcher","value":"a524e9fabd4abb93fbc2e54a"},"url":"https://jobsearcher.com/jobs/a524e9fabd4abb93fbc2e54a"}}