Robot System QA

Location: Palo Alto
Employment Type: Full time
Department: Software

At Rhoda AI, we're building the full-stack foundation for the next generation of humanoid robots, from high-performance, software-defined hardware to the foundational models and video world models that control it. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling scenarios unseen in training. We work at the intersection of large-scale learning, robotics, and systems, with a research team that includes researchers from Stanford, Berkeley, Harvard, and beyond. We're not building a feature; we're building a new computing platform for physical work, and with over $400M raised, we're investing aggressively in the R&D, hardware development, and manufacturing scale-up to make that a reality.

We're looking for a Robot Systems QA Engineer to own the quality and reliability bar across our entire robotics platform: not just the robot itself, but the full system it operates within. You'll design and execute the validation frameworks, benchmarking pipelines, and reliability testing programs that determine whether our systems are truly ready for real-world deployment.

What You'll Do

- Design and own end-to-end system validation frameworks that span the full platform: robot hardware and software, cloud infrastructure, networking and communication systems, data pipelines, and operational tooling
- Define and track reliability KPIs and deployment readiness criteria across all system components, establishing clear, measurable thresholds that gate production releases
- Build and maintain benchmarking pipelines that systematically evaluate system performance across key dimensions: uptime, latency, throughput, fault recovery, and end-to-end task success rates
- Design and execute stress testing, failure mode analysis, and fault injection programs to identify reliability risks before they surface in deployment
- Investigate and root-cause system-level failures (spanning software, hardware, networking, and infrastructure boundaries) and drive corrective actions and regression tests to prevent recurrence
- Collaborate closely with robotics, infrastructure, and ML teams to embed quality and testability into system design from the ground up
- Build observability and reporting infrastructure that gives the team continuous, clear signal on system health and release readiness

What We're Looking For

- 4+ years of experience in systems QA, reliability engineering, or a closely related field
- Strong systems thinking: ability to reason about reliability and failure modes across complex, multi-component systems
- Experience defining and tracking reliability KPIs, SLOs, and deployment readiness criteria for production systems
- Hands-on experience designing and executing benchmarking, stress testing, and failure mode analysis programs
- Strong debugging and root-cause analysis skills across software, hardware, and system boundaries
- Proficiency in Python and/or C++ for test automation and tooling
- Experience with CI/CD systems and automated test infrastructure
- Effective communication skills: able to synthesize system health signal and communicate release readiness clearly across engineering and leadership

Nice To Have (But Not Required)

- Experience with hardware-in-the-loop testing or validation of physical systems
- Familiarity with robotics software stacks, perception systems, or control systems
- Background in reliability engineering, FMEA, or safety-critical systems validation
- Experience with observability and monitoring infrastructure (e.g., Prometheus, Grafana, or similar)
- Knowledge of industrial communication protocols (EtherCAT, Modbus, gRPC) and networking fundamentals
- Familiarity with cloud infrastructure and distributed systems reliability

Why This Role

- Own the quality and reliability bar for one of the most technically ambitious robotics platforms in the world; your validation frameworks directly determine whether our systems are ready to operate in the real world
- Build foundational QA and benchmarking systems from the ground up at a critical moment in the company's development, with direct influence over how and when we deploy
- Work at the intersection of software, hardware, and infrastructure reliability, where a missed failure mode isn't just a bug; it has real consequences in the physical world