PhD Intern: HPC & LLM Inference for Distributed Systems

A research institution is seeking a PhD intern for Spring 2026 to work on high-performance computing (HPC) and large language model (LLM) inference. The intern will design efficient cache management and develop strategies for LLM inference. Preferred qualifications include experience with open-source inference engines and GPU profilers. The internship duration is flexible, with a minimum of 3 months, and the position can be remote or onsite.