Machine learning analysis of enzyme neofunctionalization following gene duplication in vertebrate evolution
Carlo De Rito
2025
Abstract
We investigated the evolutionary and functional divergence of duplicated genes in vertebrates, focusing on enzyme neofunctionalization. Starting with approximately 2,400 enzymatic proteins sharing identical Pfam domain architectures, we identified orthologous gene pairs across major vertebrate clades using a specialized pipeline. High-quality multiple sequence alignments were analyzed to compute per-residue evolutionary metrics, such as In-group and Differential Conservation Scores, utilizing the BLOSUM62 substitution matrix to pinpoint residues contributing to functional divergence. Context-based metrics from ProtTrans embeddings and functional hotspot predictions via BindEmbed21, refined with AlphaFold models and P2Rank pocket predictions, further elucidated potential functional changes. By aggregating these per-residue scores into per-protein metrics, we systematically assessed functional divergence across gene pairs. With thresholds established using a truth set from the Rhea database, 35% of the analyzed gene pairs exhibited strong evidence of neofunctionalization. Enrichment analyses incorporating tissue-specific expression data and functional annotations provided biological context for the observed divergence patterns. A case study on the skin-expressed AADACL2 gene illustrated our approach. Compared to its paralog AADAC, AADACL2 possesses additional functional pocket residues, suggesting a unique lipase function potentially involved in ceramide processing for cornified envelope formation. Experimental validation through heterologous expression faced challenges in protein solubility, leading us to consider ancestral sequence reconstruction for enhanced protein stability. Our findings advance the understanding of enzyme neofunctionalization in vertebrates and offer a framework for detecting functional divergence in duplicated genes.
| File | Size | Format |
|---|---|---|
| Carlo_De_Rito_PhD_Thesis.pdf (Open Access from 02/03/2026; license: all rights reserved) | 43.79 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/213384
URN:NBN:IT:UNIPR-213384