
Registration Methods for Surgical Augmented Reality

NERI, ALBERTO
2026

Abstract

Objective. Surgical safety remains a major challenge: even with the advent of minimally invasive surgery (MIS) and robotic systems, surgeons still rely heavily on their own experience to interpret preoperative images and mentally translate them into the operative field. In many procedures, particularly in laparoscopy, locating critical structures (e.g. vessels, tumours) is difficult, especially in altered anatomy, leading to longer operative times, increased cognitive load, and a higher risk of complications. Surgical augmented reality (AR) aims to alleviate these issues by overlaying preoperative 3D information onto the laparoscopic view, but this requires registration that is both accurate (Target Registration Error (TRE) < 5–6 mm) and robust to soft-tissue motion and deformation. This thesis addresses this need by developing and evaluating registration methods for surgical AR in laparoscopy, using only data available in a typical operating-room setup (preoperative 3D models and laparoscopic video). It follows a stepwise path: establishing practical rigid baselines in realistic settings, building the foundations (benchmarks, datasets, and reconstruction tools) needed for deformable guidance, and finally proposing a patient-specific non-rigid approach for soft-tissue registration.

Approach. We proceed in three parts. Part 1 — Traditional rigid methods: we first investigate how far classic rigid methods can go in realistic settings. (i) A comparative study of human–computer interfaces for manual alignment selects the most effective user interface and characterises the intrinsic limitations of these approaches. (ii) We then embed this interface into EVA, a surgical navigator combining point-based initialization, manual refinement, and optical tracking, and evaluate it in vivo on a porcine model for ureter identification to show how rigid AR guidance affects task performance. Finally, (iii) we extend EVA to robot-assisted laparoscopy, replacing external tracking with robotic kinematics to explore how existing robotic platforms can maintain rigid overlays without additional hardware. Part 2 — Foundations for learning-based registration: we then turn to deep learning, aiming to understand its limits and to create the resources needed for deformable methods. (iv) A structured review of surgical AR registration methods clarifies the landscape and its gaps. (v) A benchmark of rigid complete-to-partial point cloud methods under controlled deformation and visibility characterises when rigid deep models succeed and when they fail. (vi) Two dataset pipelines, including a curated set of 688 kidney meshes and point clouds with synthetic deformations and partial views, provide training and evaluation data tailored to surgical settings. In parallel, (vii) a single-image NeRF pipeline for intraoperative reconstruction in mostly rigid neurosurgical cases shows how 3D surfaces can be recovered under strict data constraints and used as front-ends for registration. Part 3 — Non-rigid registration: building on these insights and tools, (viii) we finally propose a deep-learning pipeline for non-rigid point cloud registration that predicts dense correspondences with a Transformer-based architecture and applies a physics-based deformation model. Here we change the training paradigm from anatomy-agnostic to patient-specific, training and testing on the same anatomy.

Main results. Rigid methods: (i) manual registration can yield accurate overlays but is time-consuming and operator-dependent; (ii) in the in vivo study, EVA reduced ureter-identification time compared with a conventional view, highlighting the practical benefit of the proposed registration pipeline; (iii) the robot-assisted variant maintained an accurate overlay in simulation using arm kinematics. Foundations: (v) the rigid point cloud benchmark shows sub-centimetre TRE when deformation is limited and the visible surface is sufficient, with marked degradation under large deformations and low visibility; (vi) the kidney dataset (688 models) and the deformation/visibility pipelines enable reproducible training and tests; (vii) the single-image NeRF achieves plausible reconstructions compared with multi-view baselines in mostly rigid scenarios. Non-rigid registration: (viii) on synthetic soft-deformation data with limited visibility, the patient-specific approach improves TRE by more than 4× over the best rigid baseline (e.g., 22.36 mm → 4.82 ± 3.33 mm at 5% partiality) and reaches < 3 mm TRE at higher partialities; under very large deformations with anatomical variability, it outperforms rigid methods but remains above surgical accuracy targets.

Significance. The thesis provides practical baselines, shows that well-integrated AR can shorten task time and be usable in surgical settings, and demonstrates that robotic kinematics can sustain rigid overlays without external trackers. It delivers the foundational assets (benchmarks, datasets, and a constrained-reconstruction pipeline) needed for fair comparison and faster iteration. Finally, it introduces a patient-specific non-rigid pathway that substantially narrows the gap to clinically meaningful accuracy in laparoscopy, while clarifying the remaining challenges (large deformations, partial views) and pointing to next steps such as uncertainty-aware overlays and multi-organ constraints.
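For reference, the accuracy metric reported throughout the abstract, Target Registration Error, is the Euclidean distance between corresponding target points after registration. A minimal sketch follows; the point values used are hypothetical illustrations, not data from the thesis:

```python
import numpy as np

def target_registration_error(targets_true, targets_registered):
    """Mean Euclidean distance (mm) between corresponding target points
    after registration; lower is better (surgical target: < 5-6 mm)."""
    t = np.asarray(targets_true, dtype=float)
    r = np.asarray(targets_registered, dtype=float)
    return float(np.linalg.norm(t - r, axis=1).mean())

# Hypothetical landmarks (mm), e.g. a tumour centre and a vessel branch point
ground_truth = [[10.0, 20.0, 30.0], [40.0, 50.0, 60.0]]
registered   = [[12.0, 20.0, 30.0], [40.0, 53.0, 60.0]]
print(target_registration_error(ground_truth, registered))  # prints 2.5
```

A TRE of 2.5 mm in this toy case would fall within the 5–6 mm accuracy band the abstract cites as the requirement for clinically useful overlays.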
23 February 2026
English
Veronica Penza, IIT; Leonardo De Mattos, IIT
MASSOBRIO, PAOLO
Università degli studi di Genova
File in this record:
phdunige_5543199.pdf (Adobe PDF, 8.05 MB, open access; licence: all rights reserved)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/359867
The NBN code of this thesis is URN:NBN:IT:UNIGE-359867