This thesis presents a comprehensive computational analysis of G protein-coupled receptor (GPCR) signaling, focusing on the sequence and structural factors that drive GPCR-G protein selectivity. Using AI-based bioinformatics tools such as protein language models and AlphaFold, we investigated how structural and functional patterns in GPCR sequences predict their signaling profiles. Combining ESM1b protein embeddings with experimental data, we developed PRECOGx (https://precogx.bioinfolab.sns.it/), a machine-learning tool that predicts receptor-transducer interactions across GPCR classes. A detailed structural analysis of experimental and AlphaFold-multimer-predicted GPCR-G protein interfaces identified critical determinants of coupling specificity, especially in TM5, TM6, and the intracellular loops (ICLs). Contact profiling revealed that Gs and Gi/o complexes have distinct interaction patterns, which we validated experimentally on CCKAR, where mutations at key positions influenced coupling selectivity. Structural alignment showed that Gs-GPCR complexes have more conserved, stable interfaces, whereas Gi/o complexes display variable docking, underscoring nuanced selectivity mechanisms. Furthermore, AlphaFold-multimer-predicted binary complexes supported these findings and extended the structural insights to less well characterized complexes such as G12/13. We then analyzed how GPCR activation transitions correlate with G protein-coupling specificity, revealing distinct patterns of intramolecular contact rewiring. Gs-coupled receptors showed increased TM6 connectivity, with contacts enriched in the active state, while Gi/o-coupled receptors displayed lower TM6 connectivity, higher TM7 connectivity, and a prevalence of inactive-state contacts. Using Linear Discriminant Analysis, we showed that G protein families and arrestins can be distinguished by their contact rewiring features.
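As an illustration of what "contact rewiring" between inactive and active states can mean in practice, the following minimal Python sketch derives contacts from per-residue coordinates, then reports which contacts are gained or lost upon activation and which helices they involve. It is only a sketch under stated assumptions: the 8 Å cutoff, the one-point-per-residue representation, the function names, and the toy coordinates are all hypothetical and are not taken from the thesis, whose actual contact definition may differ.

```python
from collections import Counter
from math import dist

CUTOFF = 8.0  # assumed distance cutoff in angstroms; not taken from the thesis


def contacts(coords, cutoff=CUTOFF):
    """Return the set of residue pairs (i, j), i < j, lying within `cutoff`."""
    ids = sorted(coords)
    return {
        (a, b)
        for i, a in enumerate(ids)
        for b in ids[i + 1:]
        if dist(coords[a], coords[b]) <= cutoff
    }


def rewiring(inactive, active):
    """Split contacts into those gained and those lost upon activation."""
    return {"gained": active - inactive, "lost": inactive - active}


def helix_connectivity(contact_set, helix_of):
    """Count, per helix, how many contact endpoints fall on it."""
    counts = Counter()
    for a, b in contact_set:
        counts[helix_of[a]] += 1
        counts[helix_of[b]] += 1
    return counts


# Toy data (invented): three residues, one point each (e.g. a C-alpha atom).
helix_of = {1: "TM3", 2: "TM6", 3: "TM7"}
inactive_xyz = {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0), 3: (5.0, 0.0, 0.0)}
active_xyz = {1: (0.0, 0.0, 0.0), 2: (6.0, 0.0, 0.0), 3: (15.0, 0.0, 0.0)}

delta = rewiring(contacts(inactive_xyz), contacts(active_xyz))
print("gained:", sorted(delta["gained"]))
print("lost:", sorted(delta["lost"]))
print("gained per helix:", dict(helix_connectivity(delta["gained"], helix_of)))
```

In this toy case the TM3-TM6 pair forms only in the active state and the TM3-TM7 pair only in the inactive state. Per-helix counts of this kind are one plausible way to build the feature vectors that a classifier such as Linear Discriminant Analysis could then separate by transducer family.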
Applying this knowledge, we designed a GNA13-selective DREADD GPCR through a multistate approach, optimizing sequences for GNA13 binding while minimizing interactions with other transducers. Using ProteinMPNN to generate sequence pools, we identified key residues for selective coupling, and experimental validation highlighted a promising candidate, Seq108, that couples selectively to GNA13. In summary, this study demonstrates how large-scale GPCR sequence and structural analysis, combined with deep learning, advances our understanding of GPCR signaling and enables the modulation of receptor function.
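The multistate idea described above, favoring the GNA13 state while disfavoring the other transducer states, can be illustrated with a simple ranking rule. The Python sketch below is hypothetical: the margin-based score, the function names, the candidate names, and every number are invented for illustration, and nothing here reproduces the scoring actually used in the thesis or any ProteinMPNN output.

```python
TARGET = "GNA13"


def selectivity_margin(scores, target=TARGET):
    """Target score minus the best off-target score; higher is more selective."""
    off_target = [value for name, value in scores.items() if name != target]
    return scores[target] - max(off_target)


def rank_candidates(pool, target=TARGET):
    """Sort candidate names from most to least selective for `target`."""
    return sorted(
        pool,
        key=lambda name: selectivity_margin(pool[name], target),
        reverse=True,
    )


# Invented coupling scores in [0, 1]; placeholder names, not thesis sequences.
pool = {
    "cand_A": {"GNA13": 0.9, "GNAS": 0.8, "GNAI1": 0.2, "GNAQ": 0.1},
    "cand_B": {"GNA13": 0.7, "GNAS": 0.1, "GNAI1": 0.2, "GNAQ": 0.1},
    "cand_C": {"GNA13": 0.4, "GNAS": 0.3, "GNAI1": 0.5, "GNAQ": 0.2},
}

ranking = rank_candidates(pool)
for name in ranking:
    print(name, round(selectivity_margin(pool[name]), 2))
```

Ranking by the margin against the strongest off-target, rather than by the target score alone, is what places cand_B above cand_A here: cand_A has the highest GNA13 score but also a strong competing GNAS score.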
This thesis presents a computational analysis to investigate the signaling mechanisms of G protein-coupled receptors (GPCRs), focusing on the sequence and structural determinants that drive the selectivity of their interaction with G proteins. Using advanced bioinformatics tools such as protein language models and AlphaFold (AF), we explored how structural and functional patterns in GPCRs influence their signaling profiles. With ESM1b protein embeddings and experimental data, we developed PRECOGx, a machine learning model that predicts receptor-transducer interactions for different GPCR classes. Structural analysis of GPCR-G protein interfaces, both experimental and predicted by AF-multimer, identified critical determinants of coupling specificity, especially in TM5, TM6, and the intracellular loops. Contact profiling showed that Gs and Gi/o complexes have distinct interaction patterns, validated experimentally on CCKAR. Structural alignment revealed more conserved and stable interfaces in Gs-GPCR complexes than in Gi/o complexes, which show greater variability in interaction geometry. The complexes predicted by AF-multimer supported these results and extended them to less well characterized complexes such as G12/13. Analyzing the conformational changes of GPCR activation, we observed specific patterns of intramolecular contacts: GPCR-Gs complexes show TM6 connectivity typical of the active state, whereas Gi/o complexes have more contacts on TM7, typical of the inactive state. With Linear Discriminant Analysis, we differentiated the G protein families and arrestins according to their contact patterns. Applying this knowledge, we designed a GNA13-selective DREADD GPCR through a multistate approach, optimizing sequences to bind GNA13 while reducing interactions with other transducers.
Using ProteinMPNN, we identified key residues for coupling selectivity, and experimental validation identified a sequence with selective coupling to GNA13. In summary, this study demonstrates how large-scale analysis of GPCR sequences and structures, combined with deep learning models, can advance the understanding of signal transduction in this class of receptors and enable the rational modulation of their functions.
Dissecting the sequence and structure determinants of GPCR-G protein selectivity via structural bioinformatics and machine learning
MATIC, Marin
2025
| File | Size | Format |
|---|---|---|
| Tesi.pdf (open access; license: all rights reserved) | 14.11 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/305918
URN:NBN:IT:SNS-305918