Applied Methods
Rhoda

Robot System QA

Software | Palo Alto | Full-Time | Posted 1 week ago

About the role

At Rhoda AI, we’re building the next generation of generalist intelligent robots. We own the full robotics stack, from high-performance hardware and robot systems to the infrastructure and state-of-the-art foundation world models that control our robots. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling long-tail edge cases, made possible by our cutting-edge research and end-to-end system design. We've raised over $400M and are investing aggressively in model research, infrastructure, hardware development, and manufacturing scale-up to make generalist robotics a reality.

We're looking for a Robot Systems QA Engineer to own the quality and reliability bar across our entire robotics platform — not just the robot itself, but the full system it operates within. You'll design and execute the validation frameworks, benchmarking pipelines, and reliability testing programs that determine whether our systems are truly ready for real-world deployment.

What You'll Do

  • Design and own end-to-end system validation frameworks that span the full platform — robot hardware and software, cloud infrastructure, networking and communication systems, data pipelines, and operational tooling

  • Collaborate closely with system engineering and the product team to derive test cases and success criteria from high-level requirements

  • Define and track reliability KPIs and deployment readiness criteria across all system components — establishing clear, measurable thresholds that gate production releases

  • Build and maintain benchmarking pipelines that systematically evaluate system performance across key dimensions: uptime, latency, throughput, fault recovery, and end-to-end task success rates

  • Design and execute stress testing, failure mode analysis, and fault injection programs to identify reliability risks before they surface in deployment

  • Investigate and root-cause system-level failures — spanning software, hardware, networking, and infrastructure boundaries — and drive corrective actions and regression tests to prevent recurrence

  • Collaborate closely with robotics, infrastructure, and ML teams to embed quality and testability into system design from the ground up

What We're Looking For

  • 4+ years of experience in systems QA, reliability engineering, or a closely related field

  • Strong systems thinking — ability to reason about reliability and failure modes across complex, multi-component systems

  • Experience defining and tracking reliability KPIs, SLOs, and deployment readiness criteria for production systems

  • Hands-on experience designing and executing benchmarking, stress testing, and failure mode analysis programs

  • Strong debugging and root-cause analysis skills across software, hardware, and system boundaries

  • Proficiency in Python and/or C++ for test automation and tooling

  • Effective communication skills: able to synthesize system health signals and communicate release readiness clearly across engineering and leadership

Nice to Have (But Not Required)

  • Experience with hardware-in-the-loop testing or validation of physical systems

  • Familiarity with robotics software stacks, perception systems, or control systems

  • Background in reliability engineering, FMEA, or safety-critical systems validation

  • Experience with observability and monitoring infrastructure (e.g., Prometheus, Grafana, or similar)

  • Knowledge of industrial communication protocols (e.g., EtherCAT, Modbus), RPC frameworks such as gRPC, and networking fundamentals

  • Familiarity with cloud infrastructure and distributed systems reliability

  • Experience with CI/CD systems and automated test infrastructure

Why This Role

  • Own the quality and reliability bar for one of the most technically ambitious robotics platforms in the world — your validation frameworks directly determine whether our systems are ready to operate in the real world

  • Build foundational QA and benchmarking systems from the ground up at a critical moment in the company's development, with direct influence over how and when we deploy

  • Work at the intersection of software, hardware, and infrastructure reliability, where a missed failure mode isn't just a bug but has real consequences in the physical world