Hey friend,
I have been thinking about all the things we have learned so far, like the ZMP, Jacobians, Lagrangian dynamics, trajectory planning and PID control. There is one cool technique that connects a lot of these things together in modern humanoid robots: Model Predictive Control, or MPC.
I still remember watching a video of Boston Dynamics' Atlas robot. Someone pushed it hard, but it did not just stumble around. It seemed to think about what it was going to do and plan its steps to stay upright. That is mostly thanks to MPC.
Today I want to explain what MPC is, why it is so good for humanoid balance and walking, and how it actually works, without too much complicated math.
What Is Model Predictive Control?
Model Predictive Control is pretty much what it sounds like:
- It uses a model of the robot, including physics, dynamics and constraints.
- It tries to predict what will happen over the next few seconds.
- It figures out the best sequence of actions, like how much torque to apply at each joint or where to place its feet.
- It applies only the first action of that plan, then starts the whole process again at the next step.
This “predict, optimize, apply, repeat” loop runs fast, often 100 to 1,000 times per second in advanced humanoids.
It is like a chess player who does not just think about the current move, but thinks several moves ahead and changes the plan as the game changes.
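To make the loop concrete, here is a tiny runnable sketch in Python. Everything in it is made up for illustration: the “robot” is just a cart we push toward position 0, and the optimizer is a naive random search over candidate plans (real humanoid controllers solve a structured optimization problem instead). But the predict, optimize, apply, repeat shape is the real thing:

```python
import numpy as np

DT = 0.05  # control period in seconds

def model(state, action):
    """One-step physics prediction: [position, velocity] of a cart under a force."""
    pos, vel = state
    return np.array([pos + vel * DT, vel + action * DT])

def cost(state, action):
    """Penalize distance from the target, speed, and effort."""
    pos, vel = state
    return pos**2 + 0.1 * vel**2 + 0.01 * action**2

rng = np.random.default_rng(0)

def mpc_step(state, horizon=20, n_candidates=500):
    """Predict and optimize over the horizon, then return only the FIRST action."""
    best_plan, best_cost = None, np.inf
    for _ in range(n_candidates):
        plan = rng.uniform(-5.0, 5.0, size=horizon)  # a candidate action sequence
        s, total = state, 0.0
        for a in plan:             # predict: roll the model forward in time
            s = model(s, a)
            total += cost(s, a)
        if total < best_cost:      # optimize: keep the cheapest plan found so far
            best_cost, best_plan = total, plan
    return best_plan[0]

# The "predict, optimize, apply, repeat" loop.
state = np.array([1.0, 0.0])       # start 1 m away from the target
for _ in range(40):
    action = mpc_step(state)       # plan over the whole horizon...
    state = model(state, action)   # ...but apply only the first action, then repeat
print(f"final position: {state[0]:+.3f} m")  # should end up close to 0
```

Notice that the controller throws away almost all of each plan. That apparent waste is exactly what makes MPC robust: if a push knocks the system off course, the very next replan already takes it into account.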
Why MPC Is Perfect for Humanoids
Walking on two legs is really hard for robots:
- They are inherently unstable and can easily fall over.
- They have to respect limits, like not demanding too much torque from their joints and not letting their feet slip.
- They have to react to disturbances, like being pushed or walking on uneven ground.
Traditional PID controllers only react to what is happening right now. MPC looks ahead and plans what to do.
This is why robots that use MPC can:
- Walk on uneven ground.
- Recover from being pushed without falling over.
- Carry heavy things while staying balanced.
- Switch smoothly between walking, turning and stopping.
Both Tesla Optimus and Figure 01 use forms of MPC in their walking systems. Boston Dynamics' Atlas is famous for its MPC-based control, which lets it move really dynamically.
How MPC Works for Walking
At every control step, the MPC does the following:
- It measures the state of the robot, including its joint positions, how fast it is moving and where its center of mass is.
- It predicts what will happen over the next 0.5 to 2 seconds if it does different things.
- It figures out the best sequence of actions, like where to put its next foot or how much torque to apply at each joint, while following the rules:
  - Keep the ZMP inside the area where the robot is standing.
  - Do not demand more torque than the joints can deliver.
  - Do not let the feet slip.
  - Use as little energy as possible.
- It applies only the first action from the plan.
- It starts the process again at the next step.
This constant replanning is what makes the robot seem like it is thinking ahead.
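Here is what that loop can look like for balance, sketched in the style of the classic linear MPC for walking (Kajita's cart-table formulation, refined by Wieber). Two honest simplifications: the robot is reduced to the simplified pendulum-style model I will describe in the next section, and the “keep the ZMP under the feet” rule is softened into a tracking objective instead of a hard constraint in a real QP solver. The footstep timing is also just assumed: the desired ZMP shifts between the feet every half second, like walking in place.

```python
import numpy as np

G, ZC, T, N = 9.81, 0.8, 0.02, 100   # gravity, CoM height (m), period (s), horizon

# Cart-table model: state = [com, com_vel, com_acc], control input = CoM jerk.
A = np.array([[1.0, T, T**2 / 2],
              [0.0, 1.0, T],
              [0.0, 0.0, 1.0]])
B = np.array([T**3 / 6, T**2 / 2, T])
C = np.array([1.0, 0.0, -ZC / G])    # ZMP = com - (zc / g) * com_acc

# Stack the model over the horizon: predicted_zmp = Px @ x0 + Pu @ jerks.
Px = np.vstack([C @ np.linalg.matrix_power(A, k + 1) for k in range(N)])
Pu = np.zeros((N, N))
for i in range(N):
    for j in range(i + 1):
        Pu[i, j] = C @ np.linalg.matrix_power(A, i - j) @ B

def mpc_step(x, zmp_ref, smooth=1e-6):
    """Pick the jerk sequence whose predicted ZMP best tracks the reference,
    then return only the first jerk (we replan at the very next step)."""
    H = Pu.T @ Pu + smooth * np.eye(N)
    jerks = np.linalg.solve(H, Pu.T @ (zmp_ref - Px @ x))
    return jerks[0]

def desired_zmp(k):
    """Shift the desired ZMP between the feet every 0.5 s (walking in place)."""
    return 0.1 if (k // 25) % 2 == 0 else -0.1

x = np.zeros(3)                      # CoM at rest over the origin
for k in range(200):
    ref = np.array([desired_zmp(k + i) for i in range(N)])
    x = A @ x + B * mpc_step(x, ref) # apply the first action, then replan
    if k % 25 == 0:
        print(f"t={k * T:4.2f} s  com={x[0]:+.3f} m  zmp={C @ x:+.3f} m")
```

If you run it, you can see the center of mass start drifting toward the next foot slightly before each ZMP transition, because the controller sees the transition coming inside its prediction horizon. That anticipation is the whole point.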
The Role of the Physics Model
How well MPC works depends a lot on the model it uses. Most humanoid MPC systems use a simplified version of the full Lagrangian dynamics, like the Linear Inverted Pendulum Model, or a more advanced variant.
These simplified models are fast enough to run in real time and still capture the important things about balance.
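To show just how simple, here is the entire Linear Inverted Pendulum Model as one line of Python (assuming, as the model does, that the center of mass stays at a constant height z_c):

```python
def lipm_com_accel(com_x, zmp_x, g=9.81, z_c=0.8):
    """Linear Inverted Pendulum Model: the center of mass accelerates away
    from the ZMP in proportion to their horizontal offset. That is the
    entire model, which is why it is so cheap to evaluate inside MPC."""
    return (g / z_c) * (com_x - zmp_x)
```

Compare that to the full Lagrangian dynamics of dozens of joints, and you can see why MPC can afford to evaluate a model like this hundreds or thousands of times per second.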
My Personal Take
MPC is one of the reasons modern humanoids are getting a lot better. Ten years ago most robots could only walk on flat floors. Now, thanks to MPC and some other techniques, they can handle pushes, slopes and uneven surfaces much better.
What I like most about MPC is how it brings together everything we have learned:
- ZMP for stability.
- Lagrangian dynamics for prediction.
- Jacobians for converting between joint space and task space.
- Trajectory planning for motion.
- Actuator limits from torque-speed curves.
It is like the conductor of an orchestra, making sure all the different parts work together smoothly.
Of course, MPC is not perfect. It needs a lot of computing power and a good model. If the model is not accurate, the predictions will not be either. That is why many advanced systems now combine MPC with reinforcement learning, using MPC for short-term control and reinforcement learning for long-term behavior.