Data Engineer || Python + SQL + AWS/Databricks (Only W2)
A sustainability-focused AI research lab is adding a Data Engineer to build the data systems behind its forecasting, analytics, and LLM-driven research. You will turn fragmented sources into reliable, scalable datasets that power predictive modeling and scenario analysis.

COMPANY

Pave Talent is hiring on behalf of our client, a well-funded research institute working at the intersection of AI, energy systems, and decarbonization strategy. The team blends machine learning, energy modeling, and real-world data to inform decisions on electrification, EV adoption, grid interaction, and circularity. Based in the U.S. with a global scope.

THE OPPORTUNITY

You will define how data flows across the team and partner directly with ML researchers, technical leads, and research scientists. This is a hands-on build role in a fast-moving research environment where rapid experimentation matters as much as clean engineering.

- Design, build, and maintain scalable pipelines for structured, semi-structured, and unstructured data
- Develop data models and datasets for predictive modeling, scenario analysis, and LLM-based workflows
- Improve data tooling and automation to speed up prototyping and research iteration
- Integrate third-party APIs, external datasets, and domain-specific global data sources
- Set standards for data quality, lineage, governance, and reproducibility
- Support exploratory data analysis to validate assumptions, find data gaps, and improve model inputs
- Partner with internal and external stakeholders on secure data access and governance

QUALIFICATIONS

Required:

- Bachelor's degree in a quantitative field (engineering, computer science, data science, or related)
- 3 to 5 years in data engineering or software engineering with a strong data focus
- Strong proficiency in Python, SQL, Unix tooling, and Git-based workflows
- Proven track record building pipelines across heterogeneous data sources
- Experience with cloud services, preferably AWS and Databricks
- Experience integrating external APIs and third-party datasets
- Experience with enterprise big data, ETL frameworks, and data warehousing concepts
- A background in analytics, experimentation, or statistical analysis

Bonus Points:

- Experience with automotive, manufacturing, mobility, or energy systems data
- Experience with AI/ML model development, including LLM-driven or generative/agentic AI workflows

COMPENSATION AND BENEFITS

Rate: $70 to $95 per hour, depending on experience
Location: On-site in Los Altos, CA. Must be local or able to commute.
Work authorization: Must be authorized to work in the U.S. No visa sponsorship or corp-to-corp.

Interested? Apply via LinkedIn and we'll be in touch. Confidential search; your application is fully private.

Pave Talent | Hiring Reimagined