Data Center Engineer

Senior Data Center Operations Engineer / Lead Hardware Engineer (GPU Infrastructure)

We’re working with a high-growth AI infrastructure provider building the compute, data centers, and power systems underpinning next-generation artificial intelligence.

This team is deploying and operating hyperscale, GPU-dense environments for some of the most advanced AI workloads globally. The environment is fast-paced, highly technical, and focused on delivering reliable, scalable infrastructure at speed.

Locations open: Abernathy, Barber Lake, Buffalo

The Role

As a Data Center Operations Engineer, you’ll take full ownership of onsite operations within a high-performance compute environment. You’ll be responsible for the deployment, maintenance, and reliability of GPU-based infrastructure, supporting critical AI workloads.

This is a hands-on role working close to the hardware, where you’ll act as a first responder for incidents, support ongoing scaling efforts, and ensure operational excellence across the data center. You must have GPU experience and have worked in a senior capacity.

Key Responsibilities

- Install, deploy, and configure server and network hardware, with a focus on GPU-based systems
- Troubleshoot and maintain GPU servers (e.g. H100, B200, GB200 or similar) in production environments
- Perform hardware replacements (servers, components, networking gear) while maintaining accurate asset tracking
- Support network troubleshooting, including cabling diagnostics (copper/fibre) and device-level issues
- Act as an onsite incident responder, coordinating with remote engineering teams and SMEs
- Own and resolve operational tickets, escalating where needed while maintaining high SLAs
- Support 24/7 operations via shift patterns or on-call rotations
- Collaborate with internal teams, vendors, and customers to support ongoing deployments and improvements

Requirements

- Hands-on experience working with GPU servers in production environments (essential)
- Exposure to NVIDIA-based systems such as H100, B200, A100, GB200 or similar
- Strong experience in server hardware troubleshooting: POST, BIOS, PXE boot, IPMI, BMC, etc.
- Solid understanding of networking fundamentals: TCP/IP, Ethernet, switching, routing, cabling (copper & fibre)
- Working knowledge of Linux systems administration
- Experience operating in data center or hardware-intensive environments
- Ability to work in fast-paced, high-availability environments with shifting priorities

Nice to Have

- Experience in hyperscale or HPC environments
- Background in electrical, mechanical, or related engineering disciplines
- Experience working with vendors and managing hardware lifecycle projects
- Strong communication skills and ability to collaborate across technical teams

Package

- Competitive salary + equity
- Pension / retirement plan
- Private healthcare (including dental and vision where applicable)
- Generous PTO