Scaling Model Predictive Control for High-Dimensional Robotic Systems

Amatucci, Lorenzo

Legged robots are increasingly recognized for their potential to transform many sectors by performing tasks that are challenging or hazardous for humans. Unlocking new behaviors is pivotal not only for disaster response but also for expanding their role in industries such as logistics and manufacturing. Controlling such robotic systems is challenging: the robot must be able not only to define a strategy autonomously, but also to replan it when necessary. Model Predictive Control (MPC) is a promising candidate for this application, as the task can be encoded as an Optimal Control Problem (OCP) that is solved at each control loop, with the goal expressed via a cost function and dynamic feasibility enforced through optimization constraints.

However, MPC faces practical limitations. To obtain the desired behavior, the optimization should include high-fidelity models that capture the robot's dynamics, constraints that accurately reflect physical limits, and a prediction horizon long enough to be representative of the task. Together, these requirements lead to large constrained nonlinear optimization problems that must be solved in milliseconds to maintain fast replanning rates. These conflicting requirements force trade-offs because of the computational bottleneck, limiting the deployment of MPC. In this thesis, we address these challenges by developing methods for solving large-scale OCPs that maximize hardware utilization and exploit both learning approaches and modern GPU accelerators. Our goal is to retain rich models, expressive constraints, and long horizons without sacrificing the real-time performance that MPC demands.

As a first step, we developed a whole-body MPC for legged robots that exploits distributed optimization to reduce computation time. We decompose the robot's whole-body dynamics into smaller, parallelizable subsystems and enforce consistency via the Alternating Direction Method of Multipliers (ADMM). Each subsystem solves its own OCP while a consensus sweep maintains coherence across the full robot. This reduces computation time and enables efficient scaling to more complex morphologies; for example, adding articulated arms to a quadruped does not increase computation time despite the additional degrees of freedom. Numerical evaluations show convergence to solutions consistent with state-of-the-art centralized methods, with up to a 75% reduction in computation time. This demonstrates a practical path to exploiting modern multi-core processors for faster whole-body MPC on hardware.

To achieve fast replanning rates, this first work relied on predefined contact timings, since contact-sequence generation is difficult for optimization-based controllers because of both the combinatorial nature of contact selection and the numerical difficulties introduced by contact constraints (e.g., complementarity). We tackled this problem in a second work, casting contact planning as a sampling-based search using Monte Carlo Tree Search (MCTS) over feasible contact transitions and bootstrapping the search with a learned value function to guide exploration. This hybrid approach preserves model-based guarantees for out-of-distribution scenarios while leveraging learning for speed. The resulting planner produces contact sequences and timings in real time, enabling reactive, non-gaited locomotion on hardware.

While the first work enables CPU parallelization, its benefits are limited by the relatively small number of available threads. By contrast, GPUs offer massive parallelism and the possibility of substantial gains. To harness the potential of accelerators, we developed a differentiable, GPU-accelerated MPC framework. We introduce a Sequential Quadratic Programming (SQP) solver that exploits temporal and state-space parallelism, using parallel associative scans to solve the Karush-Kuhn-Tucker (KKT) system. This yields an overall complexity of $\mathcal{O}(\log^2(n) \log N + \log^2(m))$, with $n$ states, $m$ control inputs, and horizon length $N$, instead of $\mathcal{O}(N (n + m)^3)$. Compared to the state of the art, we achieve logarithmic scaling in the horizon length and squared-logarithmic scaling in the state dimension, enabling whole-body MPC with horizons of up to 300 nodes and centralized MPC for a fleet of 16 quadrupeds, all computed in under 25 ms. Implemented in JAX, the solver is differentiable and supports large-scale batching across environments, making it feasible to train policies with MPC in the loop directly on the GPU, thereby bridging model-based and learning-based methods. This opens the door to new paradigms in legged locomotion that combine the strengths of both approaches.

Overall, this thesis advances the computational efficiency, scalability, and practicality of MPC-driven legged locomotion through numerical and learning methods.
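
The role of the parallel associative scan mentioned in the abstract can be illustrated on a scalar toy problem. The sketch below is not the thesis solver: it assumes a scalar affine recurrence x_{k+1} = a_k x_k + b_k, and the names `combine`, `associative_scan`, and `rollout` are hypothetical. It only shows that composing affine maps is associative, so all prefix compositions over a horizon of N steps can be formed in roughly log2(N) rounds rather than N dependent steps. Each round is written as a list comprehension whose entries are mutually independent and could therefore be evaluated in parallel.

```python
from math import ceil, log2


def combine(first, second):
    """Compose two affine maps x -> a*x + b, applying `first` and then `second`."""
    a1, b1 = first
    a2, b2 = second
    return (a2 * a1, a2 * b1 + b2)


def associative_scan(op, elems):
    """Inclusive prefix scan (Hillis-Steele form) with an associative operator.

    Runs ceil(log2(N)) rounds; every entry within a round is independent
    of the others, which is what makes the scan parallelizable.
    """
    out = list(elems)
    n = len(out)
    rounds = ceil(log2(n)) if n > 1 else 0
    for r in range(rounds):
        step = 2 ** r
        # Entry i absorbs the partial result that ends `step` positions earlier.
        out = [out[i] if i < step else op(out[i - step], out[i]) for i in range(n)]
    return out


def rollout(x0, maps):
    """Sequential reference: N dependent steps of x_{k+1} = a_k * x_k + b_k."""
    xs, x = [], x0
    for a, b in maps:
        x = a * x + b
        xs.append(x)
    return xs


# Hypothetical example data: integer (a_k, b_k) pairs so the comparison is exact.
maps = [(2, 1), (3, -1), (1, 4), (-2, 0), (5, 2), (1, 1), (2, -3)]
x0 = 1

# Prefix compositions from the scan, then applied to the initial state.
prefixes = associative_scan(combine, maps)
scanned = [a * x0 + b for a, b in prefixes]

# The scan reproduces the sequential rollout.
assert scanned == rollout(x0, maps)
```

In JAX, the same pattern is provided by `jax.lax.associative_scan`, which takes a user-supplied associative function. Extending the pairs above to matrix-valued maps, as a KKT solve would require, is not shown here.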

2026

Faculty/Department: 100023 - Dipartimento di Informatica, bioingegneria, robotica e ingegneria dei sistemi
Degree programme: XXXVIII CYCLE - BIOINGEGNERIA E ROBOTICA - BIOENGINEERING AND ROBOTICS - BIOENGINEERING
Publication date: 26 Feb 2026
Language: English
Supervisor, Advisor or Tutor: Claudio Semini, Victor Barasuol, Giulio Turrisi
Co-Supervisor, Examiner, Co-Tutor or Coordinators: MASSOBRIO, PAOLO
Publisher: Università degli studi di Genova
Collection: Università degli Studi di Genova

Files in this item:

File	Access	License	Size	Format
phdunige_ 5502763.pdf	Open access	All rights reserved	16.19 MB	Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/359869

The NBN code of this thesis is URN:NBN:IT:UNIGE-359869