Technical Program Manager
About the Company

GMI Cloud is building next-generation AI infrastructure designed for large-scale GPU training and inference workloads. Our platform supports high-density GPU clusters deployed in modern data centers across multiple regions.

About the Role

We are looking for a Junior Technical Program Manager (TPM) to drive the deployment and delivery of GPU cluster infrastructure. This role will work at the intersection of AI hardware platforms, high-performance networking, and data center infrastructure, coordinating across solution architects, engineering teams, vendors, and contractors to deliver production-ready AI clusters.

Responsibilities

Role Overview

As a Junior TPM at GMI Cloud, you will be at the heart of our AI infrastructure expansion. You will coordinate the end-to-end deployment of cutting-edge GPU clusters, from the first blueprint to the final production launch. You will be the "glue" connecting Solution Architects, network engineers, and field contractors to ensure our high-performance computing (HPC) environments are delivered on time and at scale.

Key Responsibilities

Cluster Deployment Orchestration: Drive the end-to-end lifecycle of AI GPU clusters, ensuring hardware, network, and power/cooling readiness align with production timelines.
Architecture Partnership: Collaborate with Infrastructure Solution Architects (SA) to support GPU server selection, network topology design, and the development of Bills of Materials (BOM).
HPC Integration: Oversee rack elevation planning, GPU server configuration, and the implementation of high-speed interconnects (InfiniBand/RoCE).
Field & Vendor Coordination: Partner with General Contractors (GC) and system integrators to manage on-site logistics, cabling, and hardware staging.
Validation & Benchmarking: Coordinate cluster bring-up activities, including multi-node GPU validation and stress testing (P2P/RDMA) to ensure peak performance.
Operational Handover: Develop runbooks and ensure monitoring/telemetry are integrated before handing off to the Operations team.

What We’re Looking For

Qualifications & Experience:

2+ years of experience in Technical Program Management, Infrastructure Delivery, or Systems Engineering.
Exposure to Data Center hardware or High-Performance Computing (HPC) environments.
Basic understanding of GPU server architecture and the components that make up a distributed cluster.
Ability to manage multi-vendor projects and navigate complex delivery timelines.

Technical Familiarity (Bonus points if you know):

High-speed networking (InfiniBand / RoCE / RDMA).
Rack elevation and high-density power/cooling requirements.
Experience with NVIDIA AI infrastructure platforms or liquid-cooling technologies.

💡 A Note on Requirements (Read This First!)

"We don't expect you to check every single box above."

At GMI Cloud, we value grit, curiosity, and the ability to learn fast over a perfect resume. If you have a solid foundation in infrastructure, a passion for AI hardware, and the confidence to master new technologies on the fly, we want to hear from you.

If you meet some of the requirements and are excited about building the future of AI infrastructure, please don't hesitate to apply!