Hey friend,
You train a humanoid robot in a simulation for hundreds of hours. It learns to walk, recover from pushes, and even do some light parkour. Then you put that policy on the real robot… and it immediately falls over or walks like a drunk penguin.
This frustrating problem is called the Sim-to-Real Gap, and it is one of the central challenges in modern humanoid robotics.
In this article we will explore what causes the sim-to-real gap and how the field is trying to solve it, focusing on two main approaches: better physics modeling and clever neural-network compensation.
What Is the Sim-to-Real Gap?
The sim-to-real gap refers to the difference between how a policy behaves in the simulator it was trained in and how that same policy performs on the physical robot.
Even small differences can cause failures in highly dynamic systems like humanoids, where balance margins are thin and contact forces change in milliseconds.
Main Sources of the Sim-to-Real Gap
Here are the main culprits:
1. Physics Modeling Errors
- Inaccurate friction models (static vs. dynamic friction)
- Simplified contact dynamics (real foot-ground interaction is very complex)
- Motor modeling inaccuracies (torque limits, backlash, heating effects)
- Sensor noise and latency not properly modeled
- Flexible structural deformation (real robots bend slightly under load)
2. Actuator Differences
- Real motors have delays, saturation and non-linear behavior that simulators often simplify.
- Series Elastic Actuators and quasi-direct drive systems behave differently in reality than in simulation.
3. Environmental Variations
- Slight changes in floor friction, temperature or wear on foot soles.
- Unmodeled disturbances like air currents or cable drag.
Two Main Approaches to Bridge the Gap
Approach 1: Improve Physics Modeling
The idea behind this approach is to make the simulator as accurate as possible.
Techniques include:
- System identification (carefully measuring robot parameters)
- High-fidelity contact models
- Adding noise and randomization during training (Domain Randomization)
- Modeling actuator dynamics more accurately
Pros: More interpretable, grounded in physics.
Cons: Difficult to get perfect; some real-world effects are nearly impossible to model.
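To make the system-identification step above concrete, here is a minimal sketch, assuming a deliberately simple joint model (inertia plus viscous and Coulomb friction) and synthetic data standing in for real logged measurements. The model form, parameter names, and values are all illustrative assumptions, not any specific robot's parameters.

```python
import numpy as np

# System-identification sketch: fit a simple joint model
#   tau = J*alpha + b*omega + c*sign(omega)
# from "logged" joint data. Everything here is illustrative.

rng = np.random.default_rng(0)

# Synthetic measurements standing in for real robot logs.
omega = rng.uniform(-5.0, 5.0, size=200)    # joint velocity (rad/s)
alpha = rng.uniform(-20.0, 20.0, size=200)  # joint acceleration (rad/s^2)
J_true, b_true, c_true = 0.05, 0.30, 0.80   # inertia, viscous, Coulomb
tau = J_true * alpha + b_true * omega + c_true * np.sign(omega)
tau += rng.normal(0.0, 0.01, size=200)      # sensor noise

# Least-squares fit: stack the regressors and solve for the parameters.
A = np.column_stack([alpha, omega, np.sign(omega)])
params, *_ = np.linalg.lstsq(A, tau, rcond=None)
J_est, b_est, c_est = params
print(f"J={J_est:.3f}  b={b_est:.3f}  c={c_est:.3f}")
```

The recovered parameters would then be written back into the simulator's joint definitions. Real identification pipelines use richer models (backlash, temperature, gear friction), but the regression structure is the same.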
Approach 2: Neural Network Compensation
This is currently the most successful strategy.
Instead of trying to make the simulation perfect, we accept that there will always be a gap and train the network to compensate for it.
Popular techniques:
- Domain Randomization: During training, randomize physics parameters (friction, mass, motor strength, latency, etc.) so the policy becomes robust to variation.
- Residual Learning: Train a network to output corrections on top of a base controller.
- System Identification + Adaptation: Have the policy continuously estimate real-world parameters and adapt online.
- Imitation + RL Fine-tuning: First imitate human or high-quality simulation data, then fine-tune with real-world data.
- Diffusion Policies & Generative Models: These have shown surprisingly good sim-to-real transfer because they learn distributions rather than single deterministic behaviors.
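The domain randomization idea above can be sketched in a few lines: resample the physics parameters at the start of every training episode so the policy never overfits to one simulator configuration. The parameter names, ranges, and the `sim.reset` call are assumptions for illustration, not any particular simulator's API.

```python
import random

# Domain randomization sketch: sample fresh physics parameters per
# episode. Ranges below are illustrative assumptions.

def sample_physics_params(rng=random):
    return {
        "friction": rng.uniform(0.4, 1.2),        # foot-ground friction coeff
        "mass_scale": rng.uniform(0.8, 1.2),      # link masses +/- 20%
        "motor_strength": rng.uniform(0.7, 1.1),  # torque scaling factor
        "latency_steps": rng.randint(0, 3),       # action delay in sim steps
        "push_force": rng.uniform(0.0, 50.0),     # random external push (N)
    }

# At the start of every episode, reset the simulator with new parameters:
for episode in range(3):
    params = sample_physics_params()
    # sim.reset(**params)  # hypothetical simulator API
    print(params)
```

The key design choice is the width of the ranges: too narrow and the policy still overfits to the simulator; too wide and training becomes unstable because no single gait works across all sampled worlds.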
Many of the successful recent humanoid projects rely heavily on massive domain randomization combined with residual policies or adaptation layers.
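To show the shape of a residual policy, here is a minimal sketch: a learned network adds small corrections on top of a hand-designed base controller, so the network only has to learn the sim-to-real error rather than the whole behavior. The linear "network", the PD-style controller, and the 0.1 correction scale are all illustrative assumptions.

```python
import numpy as np

# Residual-policy sketch: learned corrections on top of a base controller.

def base_controller(obs):
    # Simple proportional base controller (hypothetical gain).
    return -1.0 * obs

class ResidualPolicy:
    def __init__(self, obs_dim, act_dim, scale=0.1):
        # A linear map stands in for a trained MLP here.
        self.W = np.zeros((act_dim, obs_dim))
        self.scale = scale  # keep corrections small for safety

    def __call__(self, obs):
        residual = np.tanh(self.W @ obs) * self.scale
        return base_controller(obs) + residual

policy = ResidualPolicy(obs_dim=4, act_dim=4)
obs = np.array([0.1, -0.2, 0.0, 0.3])
print(policy(obs))  # with zero weights, this equals the base controller output
```

Bounding the residual (the `tanh` and `scale` here) is the usual trick: even an untrained or misbehaving network can only perturb the base controller slightly, which matters when the policy first touches real hardware.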
Current Best Practice
The most effective strategy today is a hybrid:
- Build the best physics simulator you can.
- Apply domain randomization during training.
- Use networks to learn compensatory behaviors.
- Perform fine-tuning on the real robot.
- Add safety layers as a fallback.
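As a tiny illustration of the safety-layer step above, here is a sketch of a filter that sits between the policy and the motors: it clamps commanded torques to actuator limits and falls back to a damping controller near joint limits. The limit values and function names are illustrative assumptions.

```python
import numpy as np

# Safety-layer sketch: filter policy outputs before they reach the motors.

TORQUE_LIMIT = 30.0  # Nm, assumed actuator limit
JOINT_LIMIT = 1.5    # rad, assumed symmetric joint range

def safety_filter(q, qd, tau_policy):
    # 1. Hard-clamp commanded torques to the actuator limits.
    tau = np.clip(tau_policy, -TORQUE_LIMIT, TORQUE_LIMIT)
    # 2. Near a joint limit, override with a pure damping controller.
    near_limit = np.abs(q) > 0.9 * JOINT_LIMIT
    tau = np.where(near_limit, -5.0 * qd, tau)
    return tau

q = np.array([0.2, 1.45])   # second joint is near its limit
qd = np.array([0.5, 2.0])
print(safety_filter(q, qd, np.array([100.0, 10.0])))
# first torque clamped to 30.0; second replaced by damping -10.0
```

Because the filter is simple and deterministic, it can be audited independently of the learned policy, which is exactly why it works as a fallback.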
This combination has enabled robots to go from "walking" in simulation to "reasonably robust" on real hardware.
My Personal Take
The sim-to-real gap is a humbling reminder that the real world is complex. No matter how good our simulators get, there will always be effects we fail to model.
I believe the winning approach for the next few years will be physics-aware neural networks: models that understand the underlying physics while still having the flexibility to compensate for real-world imperfections through learning.
We have come a long way from the days when people tried to make simulators perfectly accurate. Today the smartest teams accept that the gap exists and focus on building policies that are robust despite it.