JOBSEARCHER

Hardware Engineer

CDW
Fort Worth, TX
June 4th, 2026
12 MONTH CONTRACT
ONSITE 3 DAYS PER WEEK
LOCATION: DALLAS, TX

Operations-focused Hardware Systems Engineer supporting a large-scale bare-metal server environment (~17,000 servers) with a heavy emphasis on CPU and GPU compute availability. This role is centered on reliability, automation, and operational excellence: digging into systems and pipelines when things break and improving them so they break less often. (Not hands-on in the data center.)

What you’ll be doing
- Administer and support large-scale bare-metal server infrastructure, primarily HPE and Dell platforms
- Perform server break/fix troubleshooting including hardware faults, firmware/BIOS/BMC issues, POST failures, degraded components, and system instability
- Manage server lifecycle operations: onboarding, provisioning, firmware updates, BIOS/BMC configuration, and hardware refresh kits
- Own incident response and break/fix workflows while maintaining 98%+ compute availability SLAs
- Work cross-functionally with Data Center and Networking teams during hardware incidents, including ticket creation, repair coordination, and log collection
- Interface directly with HPE and Dell vendors: gathering diagnostics, sending logs, driving RMAs, and tracking issues through resolution
- Support and troubleshoot CI/CD and automation pipelines used for server provisioning, configuration, and lifecycle management
- Dig into automation code and workflows (Ansible, scripts, pipelines) when jobs fail to understand root cause and unblock deployments
- Identify recurring operational issues and contribute to process improvements, runbooks, and reliability enhancements
- Help manage and reduce the operations backlog, prioritizing fixes, cleanup, and automation improvements

Must Have:
- Hands-on experience supporting HPE and Dell servers in production, including break/fix and hardware incident troubleshooting
- Experience with HPE iLO, Dell iDRAC, and related BMC environments
- Strong understanding of server hardware components (CPU, GPU, memory, disks, NICs, power) and common failure modes
- Experience troubleshooting automation and CI/CD pipelines that manage infrastructure (not just running them, but fixing them when they fail)
- Operational mindset with experience owning incidents, SLAs, backlog items, and process improvements
- Automation experience with Ansible, Bash, Jenkins, or similar tooling
- Exposure to GPU-dense, HPC, or high-performance compute environments
- Experience improving runbooks, reducing toil, and scaling operations through automation