Neural Representation for Robotics

Bortolon, Matteo
2025

Abstract

Modern robotic systems often exhibit limited flexibility, which restricts them to particular tasks: even minor modifications, such as changes in object dimensions, can require considerable reconfiguration effort. A major bottleneck in achieving adaptable robotic systems is the perception component, which provides critical information about the objects to be manipulated, primarily through cameras. Accurate perception is especially crucial in assembly tasks, where robots must precisely localize objects for manipulation. The localization process is traditionally divided into three interdependent subtasks: geometry reconstruction, pose estimation, and grasp prediction. Geometry reconstruction has conventionally relied on depth cameras, which suffer from numerous limitations, including occlusions, reflections, challenging lighting conditions, and difficulties with small objects, all of which lead to inaccurate or incomplete geometric representations. Neural representations have recently demonstrated exceptional capabilities in Novel View Synthesis (NVS), effectively handling many scenarios where conventional depth cameras struggle. Moreover, unlike traditional methods that require pre-existing CAD models, neural representations can be built directly from object images, facilitating the development of more adaptable and flexible robotic systems. While applying NVS techniques to improve robotic manipulation appears promising, their practical integration into functional robotic systems remains challenging and computationally demanding.

This thesis establishes a framework that addresses fundamental challenges limiting the application of neural representations to robotics. First, we develop a method that reduces the number of views required for an accurate neural object representation by combining traditional view morphing with neural rendering, thereby improving practicality for real-world robotic applications. Second, we introduce two complementary 6D pose estimation methods based on NVS representations that enable robots to accurately localize objects while preserving the advantages of neural representations; both methods validate candidate poses directly by comparing the query image with synthesized viewpoints. Finally, we improve grasping precision for anthropomorphic robotic hands by incorporating novel view synthesis into grasp planning, allowing the system to assess potential grasps before execution. Extensive experimental evaluations substantiate the efficacy of the proposed methods across diverse localization and manipulation scenarios, demonstrating their potential to advance robotic perception capabilities.
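The pose-validation idea mentioned in the abstract, comparing a query image against views synthesized from a neural representation, can be pictured as a simple render-and-compare loop. The sketch below is purely illustrative and is not the thesis implementation: `render_view`, `photometric_error`, and the photometric scoring rule are hypothetical placeholders standing in for a trained NVS renderer and whatever similarity measure the actual methods use.

```python
# Minimal render-and-compare pose validation sketch (illustrative only).
import numpy as np


def render_view(pose: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for an NVS renderer: returns an HxWx3 image
    for a 4x4 camera-to-world pose. A real system would query a trained
    neural representation here."""
    seed = int(abs(pose).sum() * 1000) % (2**32)
    return np.random.default_rng(seed).random((64, 64, 3))


def photometric_error(query: np.ndarray, rendered: np.ndarray) -> float:
    """Mean squared pixel error between the query image and a rendered view."""
    return float(np.mean((query - rendered) ** 2))


def validate_poses(query: np.ndarray, candidates: list[np.ndarray]) -> np.ndarray:
    """Score each candidate 6D pose by rendering it and comparing the result
    with the query image; return the best-matching pose."""
    errors = [photometric_error(query, render_view(p)) for p in candidates]
    return candidates[int(np.argmin(errors))]


if __name__ == "__main__":
    query_image = np.random.default_rng(0).random((64, 64, 3))
    candidates = []
    for i in range(5):
        pose = np.eye(4)
        pose[0, 3] = 0.1 * i  # shift candidates along x for illustration
        candidates.append(pose)
    best = validate_poses(query_image, candidates)
    print("Best candidate translation:", best[:3, 3])
```

In practice the candidate set would come from a coarse pose hypothesis stage, and the comparison could use learned features rather than raw pixels; the sketch only shows the overall select-by-rendering structure.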
16 October 2025
English
Poiesi, Fabio
Università degli studi di Trento
Trento
125 pages
Files in this record:
bortolon_matteo_phd_thesis.pdf (Adobe PDF, 97.11 MB), open access. License: all rights reserved.

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/307039
The NBN code of this thesis is URN:NBN:IT:UNITN-307039.