
Senior Data Engineer

Job Title: Senior Data Engineer – Scientific & R&D Data Platforms
Location: Upper Providence, PA
Type: 12 Month Contract
Hours: Standard business hours
Compensation: $72 – $91/hour

Overview

A leading science‑driven organization is seeking a Senior Data Engineer to design and deliver a new enterprise data product supporting generative drug design and computational chemistry platforms. This role is focused on building scalable, well‑structured data architecture from the ground up, with long‑term expansion and downstream AI/ML integration in mind.

The ideal candidate brings strong data engineering fundamentals, hands‑on cloud experience, and an understanding of scientific or chemistry‑driven data workflows within life sciences or R&D environments.

Key Responsibilities

- Design and implement a new enterprise data product, initially delivered as a standalone solution with future integration into AI‑driven drug discovery platforms
- Build scalable data pipelines, schemas, and storage models to support large, complex scientific and chemistry‑derived datasets
- Develop and maintain data solutions primarily on GCP and BigQuery, aligned with enterprise engineering standards
- Implement data transformations and pipelines using Python, with an emphasis on data quality, traceability, and performance
- Ensure the data architecture supports future expansion, additional datasets, and evolving analytical and computational needs
- Partner closely with computational chemists, data scientists, and ML engineers to align data models with generative design workflows and ML outputs
- Apply drug design and chemistry concepts (molecular properties, structure‑activity data, experimental results) to inform data modeling decisions
- Provide technical guidance around scalability, data structure, and long‑term maintainability in an enterprise environment

Required Skills

- Strong experience in data engineering, including database design, schema modeling, and data product architecture
- Hands‑on experience with GCP and BigQuery (Postgres familiarity is a plus)
- Proficiency in Python for building and maintaining data pipelines
- Onyx platform or ecosystem experience
- Experience working with large, complex datasets at scale, ideally in scientific or R&D settings
- Background in life sciences, pharma, or scientific data platforms

Plusses

- Experience supporting downstream analytics, ML pipelines, or AI‑driven platforms
- Exposure to generative design, discovery platforms, or computational research environments
- Working knowledge of drug design, chemistry, or computational chemistry data