Scientific Data Architect
📍 Indianapolis, IN | Full-Time | Hybrid

The Role

We're looking for a product-minded, outcome-obsessed Scientific Data Architect to join a high-impact team at the intersection of life sciences R&D and AI. You'll work directly with scientific and technical stakeholders onsite a few days per week in the Indianapolis area, translating complex scientific data challenges into scalable, AI-ready solutions.

What You'll Do

- Design and implement extensible, reusable data models (tabular and JSON) that capture and organize scientific data at scale
- Develop Python-based parsers to programmatically interrogate proprietary instrument output files
- Integrate lab software (ELN/LIMS) via APIs and build data visualization apps using Streamlit, Plotly, and similar frameworks
- Collaborate with scientists, engineers, and product managers to develop and deploy ML, AI, and statistical models
- Rapidly prototype and demo solutions directly with end users to accelerate adoption
- Contribute to product roadmap by translating customer pain points into actionable priorities
- Travel to client sites in the Indianapolis, St. Louis, and Chicago regions as needed

What We're Looking For

- PhD with 4+ years or MS with 8+ years of industry experience in life sciences
- Deep domain knowledge in drug discovery, preclinical development, CMC, or product quality testing
- Proven track record designing and implementing AI/ML-driven use cases in cloud environments
- Hands-on Python development experience including data modeling, parsing, and app development
- Experience integrating ELN/LIMS systems via APIs
- Strong communication and storytelling skills; comfortable engaging scientists and executive stakeholders alike
- Self-starter mentality with a bias toward prototyping and action

Bonus Points

- Experience with Streamlit, Plotly, Holoviews, or similar data app frameworks
- Familiarity with AWS or other cloud-native environments
- Background in scientific consulting or customer-facing roles
- Experience with exploratory data analysis across complex biopharma datasets