Bioinformatics Summer Intern
Company Description

Located in the San Francisco Bay Area, El Capitan Biosciences comprises a team of passionate, versatile, and seasoned professionals. Our team's extensive research and industry backgrounds enable us to stay at the forefront of innovation.

At El Capitan Biosciences (ECB), our mission is to advance noninvasive diagnostics and monitoring by harnessing advanced machine learning algorithms and proprietary molecular methods to extract, quantify, and interpret human RNA present in stool. By enabling whole-transcriptome sequencing of human transcripts shed from gut and immune cell populations, we aim to discover clinically meaningful biomarkers that reflect mucosal integrity, presence of disease, inflammatory activity, and broader gastrointestinal health. Our multiplex RNA-based panels integrate signals across multiple pathways to deliver a comprehensive, actionable picture of disease biology, supporting improved diagnosis, monitoring, and overall understanding of gut health.

Company website: https://elcapitanbio.com/

Why Join Us

- Opportunity to help shape the commercial direction of a novel platform technology
- Direct impact on partnerships with leading pharma and diagnostics organizations
- Collaborative, science-driven startup environment

Job Description

Position Overview

We are seeking a highly motivated Bioinformatics Summer Intern to support the development of RNA-seq data analysis pipelines and machine learning models for biomarker discovery and clinical diagnostics.
This role is ideal for candidates interested in translational genomics, AI in healthcare, and next-generation diagnostics.

Schedule: 40 hrs/week, 12 weeks
Hourly rate: $35-$42.50/hr

Key Responsibilities

- Analyze RNA-seq datasets, including preprocessing, QC, normalization, and differential expression analysis
- Develop and optimize bioinformatics pipelines for transcriptomic data (bulk RNA-seq and/or degraded RNA samples)
- Apply machine learning and statistical methods to identify diagnostic or predictive biomarkers
- Assist in building and evaluating predictive models, including deep learning approaches
- Explore integration of large language models (LLMs) or multimodal AI approaches for biological data interpretation
- Collaborate with wet-lab scientists and cross-functional teams to interpret results and refine experimental design
- Present findings through reports, visualizations, and internal presentations

Qualifications

Basic Qualifications

- Current PhD student, recent PhD graduate, or postdoctoral researcher in Bioinformatics, Computational Biology, Biology, or a related field
- Hands-on experience with RNA-seq data analysis
- Strong programming skills in at least one language (e.g., R or Python)
- Familiarity with Linux/Unix environments and proficiency in bash scripting
- Formal training in statistics and machine learning

Preferred Qualifications

- Experience with machine learning frameworks (e.g., scikit-learn, TensorFlow, PyTorch)
- Experience building or applying deep learning models
- Familiarity with large language models (LLMs) or generative AI applications in biology
- Experience with NGS pipelines and tools (e.g., STAR, Salmon, nf-core)
- Experience working in high-performance computing (HPC) or cloud environments
- Strong data visualization and communication skills

Additional Information

What You'll Gain

- Hands-on experience working with cutting-edge RNA-seq datasets from clinical samples
- Exposure to real-world biomarker discovery and diagnostic development
- Mentorship from experienced scientists in bioinformatics and translational genomics
- Opportunity to contribute to impactful healthcare technologies