About the role
At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.
We're looking for a Robot Systems QA Engineer to own the quality and reliability bar across our entire robotics platform — not just the robot itself, but the full system it operates within. You'll design and execute the validation frameworks, benchmarking pipelines, and reliability testing programs that determine whether our systems are truly ready for real-world deployment.
What You'll Do
Design and own end-to-end system validation frameworks that span the full platform — robot hardware and software, cloud infrastructure, networking and communication systems, data pipelines, and operational tooling
Collaborate closely with system engineering and the product team to derive test cases and success criteria from high-level requirements
Define and track reliability KPIs and deployment readiness criteria across all system components — establishing clear, measurable thresholds that gate production releases
Build and maintain benchmarking pipelines that systematically evaluate system performance across key dimensions: uptime, latency, throughput, fault recovery, and end-to-end task success rates
Design and execute stress testing, failure mode analysis, and fault injection programs to identify reliability risks before they surface in deployment
Investigate and root-cause system-level failures — spanning software, hardware, networking, and infrastructure boundaries — and drive corrective actions and regression tests to prevent recurrence
Collaborate closely with robotics, infrastructure, and ML teams to embed quality and testability into system design from the ground up
What We're Looking For
4+ years of experience in systems QA, reliability engineering, or a closely related field
Strong systems thinking — ability to reason about reliability and failure modes across complex, multi-component systems
Experience defining and tracking reliability KPIs, SLOs, and deployment readiness criteria for production systems
Hands-on experience designing and executing benchmarking, stress testing, and failure mode analysis programs
Strong debugging and root-cause analysis skills across software, hardware, and system boundaries
Proficiency in Python and/or C++ for test automation and tooling
Effective communication skills: able to synthesize system health signals and communicate release readiness clearly across engineering and leadership
Nice to Have (But Not Required)
Experience with hardware-in-the-loop testing or validation of physical systems
Familiarity with robotics software stacks, perception systems, or control systems
Background in reliability engineering, FMEA, or safety-critical systems validation
Experience with observability and monitoring infrastructure (e.g., Prometheus, Grafana, or similar)
Knowledge of industrial communication protocols (EtherCAT, Modbus, gRPC) and networking fundamentals
Familiarity with cloud infrastructure and distributed systems reliability
Experience with CI/CD systems and automated test infrastructure
Why This Role
Own the quality and reliability bar for one of the most technically ambitious robotics platforms in the world — your validation frameworks directly determine whether our systems are ready to operate in the real world
Build foundational QA and benchmarking systems from the ground up at a critical moment in the company's development, with direct influence over how and when we deploy
Work at the intersection of software, hardware, and infrastructure reliability, where a missed failure mode isn't just a bug but has real consequences in the physical world