{"schemaVersion":"jobsearcher.job.v1","id":"58f31278fee486f447621741","url":"https://jobsearcher.com/jobs/58f31278fee486f447621741","canonicalUrl":"https://jobsearcher.com/jobs/58f31278fee486f447621741","title":"Data Center Facility Telemetry & Controls Engineer","description":"Location\nMultiple Sites — US (Remote/Hybrid eligible)\n\nRole Overview\nThe Data Center Facility Telemetry & Controls Management Engineer is a critical technical role responsible for the design, deployment, integration, and ongoing operation of Building Management Systems (BMS), Data Center Infrastructure Management (DCIM) platforms, and facility telemetry pipelines across Lambda's growing data center portfolio. This engineer ensures that all facility systems — power, cooling, thermal, and environmental — are continuously monitored, alarmed, and controllable in real time, supporting the safe and efficient operation of high-density GPU deployments at rack densities of 136–380 kW per rack.\n\nWhat You’ll Do\nBMS / Controls Architecture & Integration\n\nArchitect and manage BMS integration across colocation and Lambda-owned facilities, covering chillers, CRAHs, CDUs (Coolant Distribution Units), cooling towers, UPS systems, PDUs, and automatic transfer switches.\n\nDefine standards for BMS point lists, naming conventions, control sequences, and integration protocols (BACnet, Modbus, SNMP, OPC-UA, RESTful APIs).\n\nOversee commissioning and acceptance testing of new BMS deployments and CDU/TCS loop integrations for next-generation liquid-cooled GPU rack systems.\n\nCollaborate with colocation partners (Equinix, Digital Realty, and others) to ensure telemetry data flows from provider BMS/EPMS into Lambda's monitoring stack.\n\nDCIM & Telemetry Platform Management\n\nOwn the DCIM platform strategy and roadmap — evaluating, selecting, and implementing tooling for asset management, capacity planning, environmental monitoring, and power chain visibility.\n\nDevelop and maintain real-time dashboards 
for PUE, thermal performance, stranded capacity, and cooling system efficiency across all Lambda sites.\n\nBuild and maintain telemetry pipelines ingesting data from BMS, PDUs, in-rack sensors, CDUs, and network devices into centralized monitoring and alerting platforms (e.g., Prometheus, Grafana, InfluxDB, or equivalent).\n\nDefine alarm thresholds and escalation workflows for critical facility events including high coolant temperatures, CDU inlet/outlet anomalies, leak detection, and power exceedances.\n\nLiquid Cooling Controls & High-Density Operations\n\nDevelop control strategies and setpoint frameworks for TCS (Thermal Control System) loops supporting direct liquid cooling at densities of 220–380 kW per rack.\n\nEvaluate and qualify CDU vendors on controls integration capabilities, telemetry exposure, and remote management interfaces.\n\nDefine and enforce operational procedures for CDU commissioning, setpoint changes, loop pressure management, and fluid quality monitoring.\n\nSupport design and construction coordination for liquid cooling infrastructure in new data center buildouts, ensuring BMS and controls readiness at Day 1.\n\nOperational Reliability & Incident Response\n\nEstablish and maintain facility event management processes, including on-call response protocols for facility telemetry anomalies.\n\nLead root cause analysis for facility system failures and implement corrective actions to prevent recurrence.\n\nPartner with the data center operations team to maintain and refine emergency response runbooks tied to BMS alerts and automated controls.\n\nDrive continuous improvement in MTTR for facility-related events through better telemetry coverage and automated remediation.\n\nVendor & Stakeholder Management\n\nManage BMS integrators, DCIM vendors, and control subcontractors — from RFP through design, installation, commissioning, and ongoing support.\n\nServe as the primary technical interface with colocation providers on all BMS/EPMS integration 
topics.\n\nCollaborate with Lambda's infrastructure engineering, construction, and procurement teams to align controls requirements with facility buildout timelines.\n\nSupport due diligence and technical evaluation for new colocation sites and modular data center deployments from a telemetry and controls readiness perspective.\n\nYou\nRequired Experience\n\n7+ years of experience in data center infrastructure engineering, with at least 4 years focused on BMS, DCIM, or controls systems in a hyperscale, colocation, or AI/HPC environment.\n\nHands‑on experience designing and integrating BMS for mission‑critical facilities including UPS, PDU, CRAH/CRAC, chiller plant, cooling tower, and liquid cooling (CDU/in‑row) systems.\n\nStrong working knowledge of industrial control protocols: BACnet IP/MS‑TP, Modbus TCP/RTU, SNMP, DNP3, and modern API‑based integrations.\n\nDemonstrated experience with DCIM platforms (Nlyte, Sunbird, Vertiv TRELLIS, or equivalent) including deployment, configuration, and ongoing administration.\n\nExperience with real‑time telemetry stacks (Prometheus, InfluxDB, Grafana, or similar) applied to infrastructure monitoring use cases.\n\nStrong understanding of data center power and cooling systems, including PUE optimization, thermal management, and redundancy architectures (2N, N+1).\n\nPreferred Qualifications\n\nDirect experience with direct liquid cooling (DLC) systems, CDU controls integration, and TCS loop management for high‑density AI GPU deployments (100+ kW per rack).\n\nFamiliarity with OCP (Open Compute Project) hardware and telemetry standards.\n\nExperience working with major colocation providers (Equinix, Digital Realty, CyrusOne, etc.) 
on BMS/EPMS integration and data sharing agreements.\n\nExposure to modular or edge data center deployments and associated controls considerations.\n\nBackground in scripting and automation (Python, Ansible, Terraform) applied to infrastructure management workflows.\n\nExperience operating data centers at international scale, including Asia‑Pacific or Southeast Asian markets.\n\nRelevant certifications: CDCP, CDCE, ETA Data Center Specialist, or vendor‑specific BMS/controls certifications.\n\nWhat We Offer\n\nOpportunity to shape the telemetry and controls architecture for one of the fastest‑growing AI infrastructure platforms in the industry.\n\nWork with cutting‑edge GPU infrastructure at rack densities at the frontier of what the industry has deployed.\n\nCollaborative environment with experienced infrastructure, construction, and vendor teams across a rapidly scaling global portfolio.\n\nCompetitive compensation including salary, equity, and comprehensive benefits.\n\nFlexibility in work location with hybrid/remote options depending on facility portfolio needs.\n\nSalary Range Information\nThe annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.\n\nEqual Opportunity Employer\nLambda is an Equal Opportunity employer. 
Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.","company":"Lambda","rawCompany":"lambda","city":"San Jose","state":"CA","isRemote":false,"isActive":true,"createdAt":"2026-06-26T05:53:58.817Z","occupations":[{"code":"15-1299.08","title":"Computer Systems Engineers/Architects","slug":"computer-systems-engineers-architects"},{"code":"11-3051.01","title":"Quality Control Systems Managers","slug":"quality-control-systems-managers"},{"code":"15-1241.01","title":"Telecommunications Engineering Specialists","slug":"telecommunications-engineering-specialists"}],"industries":[{"code":"541512","title":"Computer Systems Design Services","slug":"computer-systems-design-services"},{"code":"541513","title":"Computer Facilities Management Services","slug":"computer-facilities-management-services"},{"code":"518210","title":"Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services","slug":"computing-infrastructure-providers-data-processing-web-hosting-and-related-services"}],"jobPosting":{"@context":"https://schema.org","@type":"JobPosting","title":"Data Center Facility Telemetry & Controls Engineer","description":"Location\nMultiple Sites — US (Remote/Hybrid eligible)\n\nRole Overview\nThe Data Center Facility Telemetry & Controls Management Engineer is a critical technical role responsible for the design, deployment, integration, and ongoing operation of Building Management Systems (BMS), Data Center Infrastructure Management (DCIM) platforms, and facility telemetry pipelines across Lambda's growing data center portfolio. 
This engineer ensures that all facility systems — power, cooling, thermal, and environmental — are continuously monitored, alarmed, and controllable in real time, supporting the safe and efficient operation of high-density GPU deployments at rack densities of 136–380 kW per rack.\n\nWhat You’ll Do\nBMS / Controls Architecture & Integration\n\nArchitect and manage BMS integration across colocation and Lambda-owned facilities, covering chillers, CRAHs, CDUs (Coolant Distribution Units), cooling towers, UPS systems, PDUs, and automatic transfer switches.\n\nDefine standards for BMS point lists, naming conventions, control sequences, and integration protocols (BACnet, Modbus, SNMP, OPC-UA, RESTful APIs).\n\nOversee commissioning and acceptance testing of new BMS deployments and CDU/TCS loop integrations for next-generation liquid-cooled GPU rack systems.\n\nCollaborate with colocation partners (Equinix, Digital Realty, and others) to ensure telemetry data flows from provider BMS/EPMS into Lambda's monitoring stack.\n\nDCIM & Telemetry Platform Management\n\nOwn the DCIM platform strategy and roadmap — evaluating, selecting, and implementing tooling for asset management, capacity planning, environmental monitoring, and power chain visibility.\n\nDevelop and maintain real-time dashboards for PUE, thermal performance, stranded capacity, and cooling system efficiency across all Lambda sites.\n\nBuild and maintain telemetry pipelines ingesting data from BMS, PDUs, in-rack sensors, CDUs, and network devices into centralized monitoring and alerting platforms (e.g., Prometheus, Grafana, InfluxDB, or equivalent).\n\nDefine alarm thresholds and escalation workflows for critical facility events including high coolant temperatures, CDU inlet/outlet anomalies, leak detection, and power exceedances.\n\nLiquid Cooling Controls & High-Density Operations\n\nDevelop control strategies and setpoint frameworks for TCS (Thermal Control System) loops supporting direct liquid cooling at 
densities of 220–380 kW per rack.\n\nEvaluate and qualify CDU vendors on controls integration capabilities, telemetry exposure, and remote management interfaces.\n\nDefine and enforce operational procedures for CDU commissioning, setpoint changes, loop pressure management, and fluid quality monitoring.\n\nSupport design and construction coordination for liquid cooling infrastructure in new data center buildouts, ensuring BMS and controls readiness at Day 1.\n\nOperational Reliability & Incident Response\n\nEstablish and maintain facility event management processes, including on-call response protocols for facility telemetry anomalies.\n\nLead root cause analysis for facility system failures and implement corrective actions to prevent recurrence.\n\nPartner with the data center operations team to maintain and refine emergency response runbooks tied to BMS alerts and automated controls.\n\nDrive continuous improvement in MTTR for facility-related events through better telemetry coverage and automated remediation.\n\nVendor & Stakeholder Management\n\nManage BMS integrators, DCIM vendors, and control subcontractors — from RFP through design, installation, commissioning, and ongoing support.\n\nServe as the primary technical interface with colocation providers on all BMS/EPMS integration topics.\n\nCollaborate with Lambda's infrastructure engineering, construction, and procurement teams to align controls requirements with facility buildout timelines.\n\nSupport due diligence and technical evaluation for new colocation sites and modular data center deployments from a telemetry and controls readiness perspective.\n\nYou\nRequired Experience\n\n7+ years of experience in data center infrastructure engineering, with at least 4 years focused on BMS, DCIM, or controls systems in a hyperscale, colocation, or AI/HPC environment.\n\nHands‑on experience designing and integrating BMS for mission‑critical facilities including UPS, PDU, CRAH/CRAC, chiller plant, cooling tower, and 
liquid cooling (CDU/in‑row) systems.\n\nStrong working knowledge of industrial control protocols: BACnet IP/MS‑TP, Modbus TCP/RTU, SNMP, DNP3, and modern API‑based integrations.\n\nDemonstrated experience with DCIM platforms (Nlyte, Sunbird, Vertiv TRELLIS, or equivalent) including deployment, configuration, and ongoing administration.\n\nExperience with real‑time telemetry stacks (Prometheus, InfluxDB, Grafana, or similar) applied to infrastructure monitoring use cases.\n\nStrong understanding of data center power and cooling systems, including PUE optimization, thermal management, and redundancy architectures (2N, N+1).\n\nPreferred Qualifications\n\nDirect experience with direct liquid cooling (DLC) systems, CDU controls integration, and TCS loop management for high‑density AI GPU deployments (100+ kW per rack).\n\nFamiliarity with OCP (Open Compute Project) hardware and telemetry standards.\n\nExperience working with major colocation providers (Equinix, Digital Realty, CyrusOne, etc.) 
on BMS/EPMS integration and data sharing agreements.\n\nExposure to modular or edge data center deployments and associated controls considerations.\n\nBackground in scripting and automation (Python, Ansible, Terraform) applied to infrastructure management workflows.\n\nExperience operating data centers at international scale, including Asia‑Pacific or Southeast Asian markets.\n\nRelevant certifications: CDCP, CDCE, ETA Data Center Specialist, or vendor‑specific BMS/controls certifications.\n\nWhat We Offer\n\nOpportunity to shape the telemetry and controls architecture for one of the fastest‑growing AI infrastructure platforms in the industry.\n\nWork with cutting‑edge GPU infrastructure at rack densities at the frontier of what the industry has deployed.\n\nCollaborative environment with experienced infrastructure, construction, and vendor teams across a rapidly scaling global portfolio.\n\nCompetitive compensation including salary, equity, and comprehensive benefits.\n\nFlexibility in work location with hybrid/remote options depending on facility portfolio needs.\n\nSalary Range Information\nThe annual salary range for this position has been set based on market data and other factors. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.\n\nEqual Opportunity Employer\nLambda is an Equal Opportunity employer. 
Applicants are considered without regard to race, color, religion, creed, national origin, age, sex, gender, marital status, sexual orientation and identity, genetic information, veteran status, citizenship, or any other factors prohibited by local, state, or federal law.","datePosted":"2026-06-26T05:53:58.817Z","dateModified":"2026-06-26T05:53:58.817Z","hiringOrganization":{"@type":"Organization","name":"Lambda","sameAs":"https://jobsearcher.com"},"jobLocation":{"@type":"Place","address":{"@type":"PostalAddress","addressLocality":"San Jose","addressRegion":"CA","addressCountry":"US"}},"identifier":{"@type":"PropertyValue","name":"JobSearcher","value":"58f31278fee486f447621741"},"url":"https://jobsearcher.com/jobs/58f31278fee486f447621741"}}