Scaling Model Predictive Control for High-Dimensional Robotic Systems
AMATUCCI, LORENZO
2026
Abstract
Legged robots are increasingly recognized for their potential to transform a range of sectors by performing tasks that are challenging and hazardous for humans. Unlocking new behaviors is pivotal not only for disaster response but also for expanding their role in industries such as logistics and manufacturing. Controlling such systems is difficult: the robot must not only define a strategy autonomously but also replan it when necessary. Model Predictive Control (MPC) is a promising candidate for this application, as the task can be encoded as an Optimal Control Problem (OCP) that is solved at each control cycle, with the goal expressed through a cost function and dynamic feasibility enforced through optimization constraints. However, MPC faces practical limitations. Obtaining the desired behavior requires high-fidelity models that capture the robot's dynamics, constraints that accurately reflect physical limits, and a prediction horizon long enough to be representative of the task. Together these lead to large constrained nonlinear optimization problems that must be solved in milliseconds to maintain fast replanning rates. These conflicting requirements force trade-offs because of the computational bottleneck, which limits the deployment of MPC. In this thesis, we address these challenges by developing methods that solve large-scale OCPs while maximizing hardware utilization, exploiting learning approaches and modern GPU accelerators. Our goal is to retain rich models, expressive constraints, and long horizons without sacrificing the real-time performance that MPC demands. As a first step, we developed a whole-body MPC for legged robots that exploits distributed optimization to reduce computation time. In this first work, we decompose the robot's whole-body dynamics into smaller, parallelizable subsystems and enforce consistency via the Alternating Direction Method of Multipliers (ADMM).
Each subsystem solves its own OCP while a consensus sweep maintains coherence across the full robot. This reduces computation time and enables efficient scaling to more complex morphologies; for example, adding articulated arms to a quadruped does not increase computation time despite the additional degrees of freedom. Numerical evaluations show convergence to solutions consistent with state-of-the-art centralized methods, with up to a 75% reduction in computation time. This demonstrates a practical path to exploiting modern multi-core CPUs for faster whole-body MPC on hardware. To achieve fast replanning rates, this first work relies on predefined contact timings, since contact-sequence generation is challenging for optimization-based controllers because of both the combinatorial nature of contact selection and the numerical difficulties introduced by contact constraints (e.g., complementarity). We tackled this problem in a second work, casting contact planning as a sampling-based search that uses Monte Carlo Tree Search (MCTS) over feasible contact transitions and bootstraps the search with a learned value function to guide exploration. This hybrid approach preserves model-based guarantees for out-of-distribution scenarios while leveraging learning for speed. The resulting planner produces contact sequences and timings in real time, enabling reactive, non-gaited locomotion on hardware. While the first work enables CPU parallelization, its benefits are limited by the relatively small number of available threads. By contrast, GPUs offer massive parallelism and the possibility of substantial gains. To harness the potential of accelerators, we developed a differentiable, GPU-accelerated MPC framework. We introduce a Sequential Quadratic Programming solver that exploits temporal and state-space parallelism, using parallel associative scans to solve the KKT system.
This yields an overall complexity of $\mathcal{O}(\log^2(n) \log N + \log^2(m))$ with $n$ states, $m$ control inputs, and horizon length $N$, instead of $\mathcal{O}(N (n + m)^3)$. Compared to the state of the art, we achieve logarithmic scaling in the horizon length and squared-logarithmic scaling in the state dimension, enabling whole-body MPC with horizons of up to 300 nodes and centralized MPC for a fleet of 16 quadrupeds, all computed in under 25 ms. Implemented in JAX, the solver is differentiable and supports large-scale batching across environments, making it feasible to train policies with MPC-in-the-loop directly on the GPU, thereby bridging model-based and learning-based methods. This opens the door to new paradigms in legged locomotion that combine the strengths of both approaches. Overall, this thesis advances the computational efficiency, scalability, and practicality of MPC-driven legged locomotion through numerical and learning methods.
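To illustrate why an associative scan gives logarithmic depth in the horizon length, the following is a minimal sketch in plain Python, not the thesis's solver. It assumes scalar affine dynamics $x_{k+1} = a_k x_k + b_k$ with made-up coefficients, whereas the abstract applies scans to the KKT system of the full OCP. The function names (`combine`, `inclusive_scan`, `rollout_scan`) are hypothetical. Each step is an affine map, composition of affine maps is associative, and so all prefix compositions can be built in about $\log_2 N$ rounds in which every entry depends only on the previous round.

```python
import math


def combine(first, second):
    """Compose two affine maps x -> a*x + b, applying `first` and then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def inclusive_scan(elems, op):
    """Hillis-Steele inclusive scan. Returns (prefixes, number of rounds).

    Within one round every entry depends only on the previous round's values,
    so a parallel machine could evaluate the whole round at once.
    """
    out = list(elems)
    offset, rounds = 1, 0
    while offset < len(out):
        out = [op(out[i - offset], out[i]) if i >= offset else out[i]
               for i in range(len(out))]
        offset *= 2
        rounds += 1
    return out, rounds


def rollout_sequential(x0, maps):
    """Reference rollout: N dependent steps, one after the other."""
    xs = [x0]
    for a, b in maps:
        xs.append(a * xs[-1] + b)
    return xs


def rollout_scan(x0, maps):
    """Rollout from prefix compositions: x_{k+1} = A_k * x0 + B_k."""
    prefixes, rounds = inclusive_scan(maps, combine)
    return [x0] + [a * x0 + b for a, b in prefixes], rounds


# Hypothetical coefficients; N = 300 matches the horizon quoted in the abstract.
N = 300
maps = [(0.9 + 0.001 * (k % 7), 0.1 * math.sin(k)) for k in range(N)]
x0 = 1.0

xs_seq = rollout_sequential(x0, maps)
xs_scan, rounds = rollout_scan(x0, maps)
max_err = max(abs(p - q) for p, q in zip(xs_seq, xs_scan))
print(rounds, max_err)
```

The scan reproduces the sequential rollout up to floating-point error while needing only `rounds` dependent stages instead of `N`. This simple variant does more total work than a sequential loop; in JAX, `jax.lax.associative_scan` provides the same primitive for a user-supplied associative operator.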
| File | Access | License | Size | Format |
|---|---|---|---|---|
| phdunige_ 5502763.pdf | Open access | All rights reserved | 16.19 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/359869
URN:NBN:IT:UNIGE-359869