{"schemaVersion":"jobsearcher.job.v1","id":"6d2d7549351d7d45dbe5c965","url":"https://jobsearcher.com/jobs/6d2d7549351d7d45dbe5c965","canonicalUrl":"https://jobsearcher.com/jobs/6d2d7549351d7d45dbe5c965","title":"Data Engineer || Python + SQL + AWS/Databricks (Only W2)","description":"A sustainability-focused AI research lab is adding a Data Engineer to build the data systems behind its forecasting, analytics, and LLM-driven research. You will turn fragmented sources into reliable, scalable datasets that power predictive modeling and scenario analysis.\n\n𝗖𝗢𝗠𝗣𝗔𝗡𝗬\nPave Talent is hiring on behalf of our client, a well-funded research institute working at the intersection of AI, energy systems, and decarbonization strategy. The team blends machine learning, energy modeling, and real-world data to inform decisions on electrification, EV adoption, grid interaction, and circularity. Based in the U.S. with a global scope.\n\n𝗧𝗛𝗘 𝗢𝗣𝗣𝗢𝗥𝗧𝗨𝗡𝗜𝗧𝗬\nYou will define how data flows across the team and partner directly with ML researchers, technical leads, and research scientists. 
This is a hands-on build role in a fast-moving research environment where rapid experimentation matters as much as clean engineering.\n- Design, build, and maintain scalable pipelines for structured, semi-structured, and unstructured data\n- Develop data models and datasets for predictive modeling, scenario analysis, and LLM-based workflows\n- Improve data tooling and automation to speed up prototyping and research iteration\n- Integrate third-party APIs, external datasets, and domain-specific global data sources\n- Set standards for data quality, lineage, governance, and reproducibility\n- Support exploratory data analysis to validate assumptions, find data gaps, and improve model inputs\n- Partner with internal and external stakeholders on secure data access and governance\n\n𝗤𝗨𝗔𝗟𝗜𝗙𝗜𝗖𝗔𝗧𝗜𝗢𝗡𝗦\n𝗥𝗲𝗾𝘂𝗶𝗿𝗲𝗱:\n- Bachelor's degree in a quantitative field (engineering, computer science, data science, or related)\n- 3 to 5 years in data engineering or software engineering with a strong data focus\n- Strong proficiency in Python, SQL, Unix tooling, and Git-based workflows\n- Proven track record building pipelines across heterogeneous data sources\n- Experience with cloud services, preferably AWS and Databricks\n- Experience integrating external APIs and third-party datasets\n- Experience with enterprise big data, ETL frameworks, and data warehousing concepts\n- A background in analytics, experimentation, or statistical analysis\n\n𝗕𝗼𝗻𝘂𝘀 𝗣𝗼𝗶𝗻𝘁𝘀:\n- Experience with automotive, manufacturing, mobility, or energy systems data\n- Experience with AI/ML model development, including LLM-driven or generative/agentic AI workflows\n\n𝗖𝗢𝗠𝗣𝗘𝗡𝗦𝗔𝗧𝗜𝗢𝗡 𝗔𝗡𝗗 𝗕𝗘𝗡𝗘𝗙𝗜𝗧𝗦\n𝗥𝗮𝘁𝗲: $70 to $95 per hour, depending on experience\n𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻: On-site in Los Altos, CA. Must be local or able to commute.\n𝗪𝗼𝗿𝗸 𝗮𝘂𝘁𝗵𝗼𝗿𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Must be authorized to work in the U.S. No visa sponsorship or corp-to-corp.\n\nInterested? Apply via LinkedIn and we'll be in touch. 
Confidential search; your application is fully private.\n\n𝗣𝗮𝘃𝗲 𝗧𝗮𝗹𝗲𝗻𝘁 | 𝗛𝗶𝗿𝗶𝗻𝗴 𝗥𝗲𝗶𝗺𝗮𝗴𝗶𝗻𝗲𝗱","company":"Pave Talent","rawCompany":"pave talent","city":"Los Altos","state":"CA","isRemote":false,"isActive":false,"createdAt":"2026-06-06T11:47:21.093Z","occupations":[{"code":"15-2051.00","title":"Data Scientists","slug":"data-scientists"},{"code":"15-1243.01","title":"Data Warehousing Specialists","slug":"data-warehousing-specialists"},{"code":"15-1252.00","title":"Software Developers","slug":"software-developers"}],"industries":[{"code":"541512","title":"Computer Systems Design Services","slug":"computer-systems-design-services"},{"code":"541690","title":"Other Scientific and Technical Consulting Services","slug":"other-scientific-and-technical-consulting-services"},{"code":"541511","title":"Custom Computer Programming Services","slug":"custom-computer-programming-services"}],"jobPosting":{"@context":"https://schema.org","@type":"JobPosting","title":"Data Engineer || Python + SQL + AWS/Databricks (Only W2)","description":"A sustainability-focused AI research lab is adding a Data Engineer to build the data systems behind its forecasting, analytics, and LLM-driven research. You will turn fragmented sources into reliable, scalable datasets that power predictive modeling and scenario analysis.\n\n𝗖𝗢𝗠𝗣𝗔𝗡𝗬\nPave Talent is hiring on behalf of our client, a well-funded research institute working at the intersection of AI, energy systems, and decarbonization strategy. The team blends machine learning, energy modeling, and real-world data to inform decisions on electrification, EV adoption, grid interaction, and circularity. Based in the U.S. with a global scope.\n\n𝗧𝗛𝗘 𝗢𝗣𝗣𝗢𝗥𝗧𝗨𝗡𝗜𝗧𝗬\nYou will define how data flows across the team and partner directly with ML researchers, technical leads, and research scientists. 
This is a hands-on build role in a fast-moving research environment where rapid experimentation matters as much as clean engineering.\n- Design, build, and maintain scalable pipelines for structured, semi-structured, and unstructured data\n- Develop data models and datasets for predictive modeling, scenario analysis, and LLM-based workflows\n- Improve data tooling and automation to speed up prototyping and research iteration\n- Integrate third-party APIs, external datasets, and domain-specific global data sources\n- Set standards for data quality, lineage, governance, and reproducibility\n- Support exploratory data analysis to validate assumptions, find data gaps, and improve model inputs\n- Partner with internal and external stakeholders on secure data access and governance\n\n𝗤𝗨𝗔𝗟𝗜𝗙𝗜𝗖𝗔𝗧𝗜𝗢𝗡𝗦\n𝗥𝗲𝗾𝘂𝗶𝗿𝗲𝗱:\n- Bachelor's degree in a quantitative field (engineering, computer science, data science, or related)\n- 3 to 5 years in data engineering or software engineering with a strong data focus\n- Strong proficiency in Python, SQL, Unix tooling, and Git-based workflows\n- Proven track record building pipelines across heterogeneous data sources\n- Experience with cloud services, preferably AWS and Databricks\n- Experience integrating external APIs and third-party datasets\n- Experience with enterprise big data, ETL frameworks, and data warehousing concepts\n- A background in analytics, experimentation, or statistical analysis\n\n𝗕𝗼𝗻𝘂𝘀 𝗣𝗼𝗶𝗻𝘁𝘀:\n- Experience with automotive, manufacturing, mobility, or energy systems data\n- Experience with AI/ML model development, including LLM-driven or generative/agentic AI workflows\n\n𝗖𝗢𝗠𝗣𝗘𝗡𝗦𝗔𝗧𝗜𝗢𝗡 𝗔𝗡𝗗 𝗕𝗘𝗡𝗘𝗙𝗜𝗧𝗦\n𝗥𝗮𝘁𝗲: $70 to $95 per hour, depending on experience\n𝗟𝗼𝗰𝗮𝘁𝗶𝗼𝗻: On-site in Los Altos, CA. Must be local or able to commute.\n𝗪𝗼𝗿𝗸 𝗮𝘂𝘁𝗵𝗼𝗿𝗶𝘇𝗮𝘁𝗶𝗼𝗻: Must be authorized to work in the U.S. No visa sponsorship or corp-to-corp.\n\nInterested? Apply via LinkedIn and we'll be in touch. 
Confidential search; your application is fully private.\n\n𝗣𝗮𝘃𝗲 𝗧𝗮𝗹𝗲𝗻𝘁 | 𝗛𝗶𝗿𝗶𝗻𝗴 𝗥𝗲𝗶𝗺𝗮𝗴𝗶𝗻𝗲𝗱","datePosted":"2026-06-06T11:47:21.093Z","dateModified":"2026-06-06T11:47:21.093Z","hiringOrganization":{"@type":"Organization","name":"Pave Talent","sameAs":"https://jobsearcher.com"},"jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"Los Altos","addressRegion":"CA","addressCountry":"US"}},"identifier":{"@type":"PropertyValue","name":"JobSearcher","value":"6d2d7549351d7d45dbe5c965"},"url":"https://jobsearcher.com/jobs/6d2d7549351d7d45dbe5c965"}}