Evaluation and optimization of innovative sequencing approaches for large-scale genotyping

DE ANTONI, LUCA
2025

Abstract

The study of plant genomes is vital for understanding agro-biodiversity and improving the sustainability of crop production and nutrition. Efficient and accurate genotyping of hundreds to thousands of individuals is essential for modern genetics, particularly in studies of genotype-phenotype associations, detection of selection, and identification of the genetic basis of phenotypic traits. Cost-effective genotyping methods are necessary to increase sample sizes without significantly raising resource investments. Thanks to the decrease in sequencing costs, Whole Genome Sequencing (WGS) is increasingly used because it offers comprehensive genome coverage. However, high library preparation costs remain a barrier, especially for small genomes or low-coverage studies. Innovative multiplexing technologies are needed to reduce costs, though their efficiency compared to traditional singleplex methods needs further evaluation. We identified Twist and seqWell as innovative multiplex library preparation technologies and assessed their genotyping performance against traditional singleplex approaches in samples of different legume species. Both approaches proved to be valid solutions for studies on large populations, in particular Twist, which allows for greater reproducibility across individuals. However, costs remain high in species with large, repetitive genomes, where a large fraction of the DNA (usually over 80%) is repetitive. Since the coding regions are of primary interest, sequencing non-coding or repetitive regions generates non-essential data and wastes sequencing resources. To enrich sequencing data for meaningful single-copy regions, in this study we tested a novel solution: a CRISPR-Cas9-based method that selectively depletes repetitive elements from a genomic library.
We implemented the method on the genome of lentil (Lens culinaris), a crop with a large genome (3.7 Gbp) and a high repeat content (~85%), designing a custom gRNA panel. We depleted up to 41% of repetitive regions, improving coverage of single-copy regions by 2.6-fold and variant detection by 12-fold when sequencing 25 million fragments. Our results showed that CRISPR-Cas9-driven repeat depletion focuses sequencing data on single-copy regions, thus improving high-density, genome-wide genotyping in large and repetitive plant genomes. The Cas9 genotyping method was used to genotype a set of 105 cultivated accessions of Lens culinaris, allowing us to perform a preliminary genetic diversity analysis and obtain valuable information about the evolution and adaptation of the studied samples. Overall, both the multiplex libraries and the CRISPR-Cas9-based method hold significant potential to enhance our genetic knowledge of plant species that are challenging to analyse because of their large genome sizes and high proportions of repetitive DNA, which often require substantial financial investment. Ultimately, such technological advancements will foster genotyping applications such as population studies, eQTL analyses, GWAS, and pre-breeding programs.
English
98
Files in this item:

PhD_Thesis_De Antoni_Luca.pdf (embargoed until 31/12/2028), 6.04 MB, Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/202825
The NBN code of this thesis is URN:NBN:IT:UNIVR-202825