Data Engineer
Open to both Austin, Texas and Cupertino, California locations!

Description:
We are seeking experienced Data Engineers to join a high-impact team focused on building and optimizing modern data platforms. This role is ideal for hands-on engineers who thrive in fast-paced environments and can independently design, develop, and maintain scalable data solutions.

You will play a critical role in enabling data-driven decision-making by developing robust pipelines, ensuring data quality, and supporting advanced analytics and machine learning initiatives. This is a highly technical position requiring strong coding ability, problem-solving skills, and real-world experience with large-scale data systems.

Qualifications:
- 6+ years of hands-on data engineering experience
- Strong programming skills in Python and SQL (Scala is highly preferred)
- Spark / PySpark
- Kafka
- Airflow or similar orchestration tools
- Docker & Kubernetes (required)
- Spark architecture and performance tuning
- Data warehousing concepts and best practices
- Pipeline design and optimization
- Experience with cloud platforms (AWS, Azure, or GCP)
- Ability to troubleshoot complex data and system issues
- Strong communication skills and ability to work independently

Preferred Qualifications:
- Experience with data modeling and lakehouse architectures
- Familiarity with CI/CD, data observability, and infrastructure-as-code
- Exposure to ML pipelines or advanced analytics workflows
- Experience with tools such as Snowflake, Databricks, or similar platforms

Responsibilities:
- Design and build scalable batch and near real-time data pipelines
- Develop and optimize ETL/ELT workflows for performance and cost efficiency
- Work with Spark-based systems, including understanding job execution and optimization
- Design and implement data models (e.g., star schema, medallion architecture)
- Support machine learning and advanced data use cases (feature engineering, retraining workflows, etc.)
- Ensure data quality, governance, privacy, and system reliability
- Troubleshoot production issues and handle real-world incident management scenarios
- Collaborate on data pipeline orchestration and job scheduling
- Continuously improve system performance and scalability