Digital Pathology Applications in Rare and Underrepresented Thoracic Malignancies

ZULLO, LODOVICA
2026

Abstract

Thoracic cancers represent a leading cause of cancer-related mortality worldwide and encompass both rare malignancies, such as thymic epithelial tumors (TETs), and highly prevalent diseases, including non–small cell lung cancer (NSCLC). Notably, even common cancers comprise multiple molecularly defined subgroups with distinct prognoses and therapeutic strategies, each of which may effectively represent a rare entity. Artificial intelligence (AI) is increasingly being integrated into clinical and translational oncology. In particular, digital pathology and pathomics offer the opportunity to extract clinically relevant information directly from routine diagnostic tissue slides, potentially identifying patterns beyond human visual perception. However, the development and robust validation of AI algorithms require large, heterogeneous, and high-quality datasets. In rare cancers, assembling such cohorts remains a major challenge, limiting both research progress and the implementation of AI-based approaches. In this work, we investigated the potential of digital pathology to address clinically meaningful questions in two underrepresented entities in thoracic oncology. First, leveraging the French RYTHMIC network, we assembled the largest and most heterogeneous collection of TET whole-slide images (WSIs), comprising several hundred cases. We developed and independently validated a weakly supervised deep learning model for histological subtype classification, achieving a mean AUC-ROC of 0.99 in independent testing. Model interpretability analyses, combining computational histomics features with Shapley value–based attention maps, revealed biologically grounded patterns: regions supporting or opposing class predictions were enriched in specific cellular and morphological features.
Second, we retrospectively collected an international multicenter cohort of patients with advanced EGFR-mutated NSCLC treated with EGFR tyrosine kinase inhibitors (TKIs), integrating clinical data, molecular data, and diagnostic hematoxylin and eosin (H&E)-stained WSIs. Despite the inherent limitations of small diagnostic biopsies, we identified necrotic foci, readily detectable on routine H&E-stained slides, as a feature associated with poorer survival outcomes. Moreover, preliminary results from a weakly supervised deep learning model suggest that, even in limited tissue samples, pathomics may enable the discovery of prognostic and predictive signatures. Although these results warrant further external validation, the model predicted progression-free survival (PFS) on first-line EGFR-TKIs with encouraging performance. Together, these findings highlight the potential of digital pathology to advance precision oncology in both rare and molecularly defined thoracic cancers, while underscoring the importance of large, high-quality collaborative datasets to achieve robust and reliable findings.
2 April 2026
English
digital pathology; deep learning; EGFR; NSCLC; thymic epithelial tumor; thymoma; thymic carcinoma; oncology; thoracic oncology
GENOVA, CARLO
BOLLINI, SVEVA
Università degli studi di Genova
Files in this item:
File: phdunige_4712787.pdf
Embargoed until 2 April 2027
License: All rights reserved
Size: 2.87 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/363433
The NBN code of this thesis is URN:NBN:IT:UNIGE-363433