Reinforcement Learning Engineer - Whole-Body Control
At Mentee Robotics, we are redefining humanoid automation with an AI-first approach - combining perception, reasoning, and dexterous manipulation into fully autonomous systems that continuously learn and adapt.
We are now expanding with a new robotics engineering center in China, working hand-in-hand with the engineering teams at our headquarters. Its mission: to rapidly develop our next-generation full-size humanoid and bring it to life - a walking, working platform that becomes the foundation of our next generation of products. This is a small, senior, hands-on team where speed of iteration is the core value.
We are looking for an RL Engineer to train the whole-body behaviors of our humanoid in simulation, on Isaac Sim with the Newton physics engine. In our architecture there is no classical motion controller above the joint level - the learned policy is the robot's entire behavior layer, coordinating all degrees of freedom and commanding joints directly through the actuator controllers. You are on the critical path to the robot's first steps.
Who you are:
- A deep RL practitioner who has actually transferred policies to physical legged robots, not only evaluated them on benchmarks
- Strong engineer first: your training code is infrastructure, not a notebook
- Comfortable being the owner of the robot's most visible capability
Responsibilities:
- Design, train, and iterate whole-body RL policies in Isaac Sim/Newton: walking, balance recovery, manipulation, and coordinated loco-manipulation behaviors
- Own reward design, curriculum learning, and training methodology; incorporate motion priors (mocap/imitation, AMP-style) for natural movement
- Build and maintain the training infrastructure: massively parallel simulation, experiment tracking, evaluation suites
- Work with the Sim2Real engineer to bake measured actuator constraints and domain randomization into training
- Work with the compute platform engineer to deploy policies on the robot and run on-hardware evaluation
- Progressively expand the policy's capability envelope - from first steps to dynamic, contact-rich whole-body tasks
- Define what the policy observes and commands together with the motion control and platform teams - you co-own the robot's core software contract
Requirements:
- M.Sc. or Ph.D. (or equivalent industry experience) in Computer Science, Robotics, or a related field
- 5+ years of hands-on deep RL for robotics with strong PyTorch engineering skills
- Direct experience with Isaac Sim/Newton (or equivalent GPU-parallel simulators) for whole-body RL on legged robots
- Proven sim-to-real transfer of at least one policy to a physical legged robot
- Deep understanding of PPO-family training at scale, reward shaping, and curriculum design
Advantages:
- Familiarity with Isaac Lab
- Humanoid (vs. quadruped) whole-body RL experience
- Experience with motion-imitation methods (AMP, DeepMimic-style) and mocap data pipelines
- Publications in top robotics/ML venues (RSS, CoRL, ICRA, NeurIPS) or experience at leading humanoid teams (Unitree, AgiBot, Robotera)
- Experience with teleoperation or demonstration data pipelines for whole-body skills
- Comfortable communicating technical topics in English with international teams
