Characterizing Structure and Free Energy Landscape of Proteins by NMR-guided Metadynamics

Granata, Daniele
2013

Abstract

In the last two decades, a series of experimental and theoretical advances has made it possible to obtain a detailed understanding of the molecular mechanisms underlying the folding process of proteins. With the increasing power of computer technology, as well as improvements in force fields, atomistic simulations are also becoming increasingly important because they can generate highly detailed descriptions of the motions of proteins. Anton, a supercomputer specifically designed to integrate Newton's equations of motion for proteins, has recently broken the millisecond time barrier. This achievement has allowed the direct calculation of repeated folding events for several fast-folding proteins and the characterization of the molecular mechanisms underlying protein dynamics and function. However, these exceptional resources are available to only a few research groups in the world, and the observation of a few events of a specific process is usually not enough to provide a statistically significant picture of the phenomenon. In parallel, it has also been realized that bringing together experimental measurements and computational methods expands the range of problems that can be addressed. For example, by incorporating structural information as structural restraints in molecular dynamics simulations, it is possible to obtain structural models of transiently populated states, as well as of native and non-native intermediates explored during the folding process. By applying this strategy to structural parameters measured by nuclear magnetic resonance (NMR) spectroscopy, one can determine the atomic-level structures and characterize the dynamics of proteins.
In these approaches, the experimental information is used to create an additional term in the force field that penalizes deviations from the measured values, thus restraining the sampling of conformational space to regions close to those observed experimentally. In this thesis we propose an alternative strategy to exploit experimental information in molecular dynamics simulations. In this approach the measured parameters are not used as structural restraints in the simulations, but rather to build collective variables within metadynamics calculations. In metadynamics, the conformational sampling is enhanced by constructing a time-dependent potential that discourages the exploration of regions already visited, expressed in terms of specific functions of the atomic coordinates called collective variables. In this work we show that NMR chemical shifts can be used as collective variables to guide the sampling of conformational space in molecular dynamics simulations. Since the method that we discuss here enhances the conformational sampling without modifying the force field through the introduction of structural restraints, it allows the statistical weights corresponding to the force field used in the molecular dynamics simulations to be estimated reliably. In the present implementation we used the bias exchange metadynamics method, an enhanced sampling technique that allows the free energy to be reconstructed as a simultaneous function of several variables. By using this approach, we have been able to compute the free energy landscape of two different proteins by explicit solvent molecular dynamics simulations. In the application to a well-structured globular protein, the third immunoglobulin-binding domain of streptococcal protein G (GB3), our calculation predicts the native fold as the lowest free energy minimum and also identifies an on-pathway compact intermediate with non-native topological elements.
In addition, we provide a detailed atomistic picture of the structure at the folding barrier, which shares with the native state only a fraction of the secondary structure elements. The further application to the 40-residue form of amyloid beta allows another remarkable achievement: the quantitative description of the free energy landscape of an intrinsically disordered protein. Proteins of this kind are characterized by the absence of a well-defined three-dimensional structure under native conditions and are therefore hard to investigate experimentally. We found that the free energy landscape of this peptide has approximately inverted features with respect to normal globular proteins. Indeed, the global minimum consists of highly disordered structures, while higher free energy regions correspond to partially folded conformations. These structures are kinetically committed to the disordered state, but they are transiently explored even at room temperature. This makes our findings particularly relevant, since this protein is involved in Alzheimer's disease: it is prone to aggregate into oligomers that are toxic for the cells and that are determined by the interaction of monomers in an extended beta-strand organization. Our structural and energetic characterization allows us to define a library of possible metastable states that are involved in the aggregation process. These results have been obtained using relatively limited computational resources. For example, the total simulation time required to reconstruct the thermodynamics of GB3 is about three orders of magnitude less than the typical folding timescale of similar proteins, which have also been simulated on Anton. We thus anticipate that the technique introduced in this thesis will allow the determination of the free energy landscapes of a wide range of proteins for which NMR chemical shifts are available.
Finally, since chemical shifts are the only external information used to guide the folding of the proteins, our methods can also be successfully applied to the challenging task of NMR structure determination, as we have demonstrated in a blind prediction test on the last CASD-NMR target.
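
The metadynamics idea summarized in the abstract (a time-dependent potential that discourages revisiting regions of collective-variable space) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the thesis: the system is a one-dimensional toy double well in reduced units, the collective variable is simply the particle position rather than a chemical-shift-based variable, plain (single-walker) metadynamics is used instead of bias exchange, and all names and parameter values are hypothetical.

```python
import numpy as np

# Hypothetical toy parameters in reduced units (assumptions, not thesis values).
BARRIER = 10.0   # barrier height of the double well, in units of kT
HEIGHT = 0.2     # height of each deposited Gaussian, in units of kT
WIDTH = 0.2      # width (sigma) of each deposited Gaussian

def physical_force(x):
    """Force -dU/dx for the toy double well U(x) = BARRIER * (x**2 - 1)**2."""
    return -4.0 * BARRIER * x * (x**2 - 1.0)

def bias_and_force(s, centers, height=HEIGHT, width=WIDTH):
    """History-dependent bias V(s) (sum of Gaussians) and its force -dV/ds."""
    if len(centers) == 0:
        return 0.0, 0.0
    d = s - np.asarray(centers)
    g = height * np.exp(-0.5 * (d / width) ** 2)
    return g.sum(), (g * d / width**2).sum()

def run_metadynamics(n_steps=20000, dt=1e-3, kT=1.0, stride=50, seed=0):
    """Overdamped Langevin dynamics with a Gaussian deposited every `stride` steps."""
    rng = np.random.default_rng(seed)
    x = -1.0                      # start in the left well
    centers = []                  # collective-variable values where Gaussians sit
    traj = np.empty(n_steps)
    for step in range(n_steps):
        # Total force: unmodified physical force plus the metadynamics bias force.
        f = physical_force(x) + bias_and_force(x, centers)[1]
        if step % stride == 0:
            centers.append(x)     # discourage returning to the current CV value
        x += f * dt + np.sqrt(2.0 * kT * dt) * rng.standard_normal()
        traj[step] = x
    return traj, np.array(centers)

if __name__ == "__main__":
    traj, centers = run_metadynamics()
    grid = np.linspace(-1.5, 1.5, 7)
    # Up to an additive constant, -V(s) is a rough estimate of the free energy.
    estimate = np.array([-bias_and_force(s, centers)[0] for s in grid])
    print("visited range:", traj.min(), traj.max())
    print("free energy estimate (shifted):", estimate - estimate.min())
```

In this sketch the accumulated Gaussians gradually fill the starting well until the walker escapes over the barrier, and the negative of the deposited bias gives a rough estimate of the free energy along the collective variable. The physical force itself is never altered by a restraint term, which is the property the abstract emphasizes.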
21-Oct-2013
English
Laio, Alessandro
SISSA
Trieste
Files in this item:
File: 1963_7194_thesis_DGranata.pdf
Access: open access
Size: 8.63 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/67192
The NBN code of this thesis is URN:NBN:IT:SISSA-67192