Modeling drug-biosystems interactions at multiple scales through AI methods
CARLI, FRANCESCO
2026
Abstract
Modern drug discovery and precision medicine face the persistent challenge of integrating information across scales, from biochemical drug–protein interactions to cellular responses and ultimately patient outcomes. Bridging these levels of biological organization remains a major barrier to translating preclinical findings into therapies. Artificial intelligence (AI) offers powerful tools to model drug–biosystem interactions, but their impact depends on methods that are accurate, interpretable, generalizable, and accessible. This thesis addresses these needs by developing computational frameworks spanning biochemical, cellular, and patient scales. At the biochemical scale, we developed BindSight, a modular framework for drug–target interaction prediction unifying data curation, representation learning, model evaluation, and deployment. It incorporates scaffold-aware splitting, protein promiscuity stratification, and a two-phase prediction scheme: rapid library-wide screening followed by TabPFN re-scoring to balance efficiency with generalization. Central to BindSight is a CLIP-style architecture embedding proteins and compounds in a shared latent space, supporting heterogeneous molecular and protein representations, and accommodating advanced loss functions with distributed training. At the cellular scale, we introduced CellHit, an interpretable framework that predicts drug responses from transcriptomic profiles of cancer cell lines and extends them to patient tumors. By training on large pharmacogenomic resources (GDSC, PRISM) and aligning them with patient bulk RNA-seq through Celligner, the framework uncovered transcriptional programs underpinning drug sensitivity and recovered known drug–target relationships. Incorporating LLM-curated mechanism-of-action pathways enhanced predictive power. To promote accessibility, CellHit has been released as open-source software and deployed as a publicly available web server. 
At the patient scale, we applied our models to over 10,000 patient transcriptomes from The Cancer Genome Atlas (TCGA), successfully recovering a majority of approved drug–indication pairs and providing strong in silico validation. Importantly, we bridged the gap from computational hypotheses to experimental confirmation through prospective wet-lab experiments, which validated the novel vulnerabilities predicted by our models in pancreatic and glioblastoma cell lines. In sum, this thesis demonstrates how AI can model drug–biosystem interactions across biochemical, cellular, and patient scales. By combining predictive performance with interpretability, biological grounding, and accessibility, it offers methodological advances, experimentally supported insights, and open resources to accelerate drug discovery and translational medicine. While BindSight is a domain-agnostic tool for drug–target interaction prediction, the subsequent cellular- and patient-scale work focuses specifically on oncology applications.

| File | Size | Format |
|---|---|---|
| finale_carli_pdfa.pdf (open access; license: Creative Commons) | 27.07 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/364133
URN:NBN:IT:UNIPI-364133