Objective Bayes Structure Learning in Gaussian Graphical Models
PETRAKIS, NIKOLAOS
2020
Abstract
Graphical models represent conditional independence relationships among variables by means of a graph whose nodes correspond to the variables. They are widely used in genomic studies, finance, and energy forecasting, among other fields. More specifically, we consider a collection of q variables whose conditional independence structure is represented by an undirected graph, and we assume that the structure of this graph is unknown. We are interested in inferring the graph's structure from the data at hand. In the literature this procedure is referred to as structure learning: selecting a graphical model that depicts the conditional independence relationships between the q variables. We start by defining a model space, which consists of the set of all possible graphical models; we then define a scoring function that lets us score the different models in the model space; and finally we construct a search algorithm that navigates the model space to identify the model that best explains the problem at hand. The choice of scoring function is crucial for the search procedure through the model space. Our approach to this problem is purely Bayesian, which allows uncertainty to be handled more thoroughly, and we use estimates of posterior model probabilities to rank the models. Specifying a conditional prior on the column covariance matrix is not trivial, because each graph under consideration induces a different independence structure and therefore affects the parameter space. In this context we cannot directly use improper priors, since they would result in indeterminate Bayes factors. We would therefore have to carefully elicit a prior distribution under each graph, a task that becomes infeasible in higher dimensions.
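To give a rough sense of why a search algorithm is needed rather than exhaustive scoring, the following minimal Python sketch counts and enumerates the undirected graphs on q labelled nodes. The function names `n_graphs` and `all_graphs` are illustrative assumptions and do not come from the thesis; the sketch only covers the model space, not the scoring function or the stochastic search.

```python
from itertools import combinations


def n_graphs(q):
    """Number of undirected graphs on q labelled nodes.

    Each of the q*(q-1)/2 possible edges is either present or absent.
    """
    return 2 ** (q * (q - 1) // 2)


def all_graphs(q):
    """Yield every undirected graph on nodes 0..q-1 as a frozenset of edges.

    Only practical for very small q, since the model space grows as
    2^(q*(q-1)/2).
    """
    edges = list(combinations(range(q), 2))
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            yield frozenset(subset)


if __name__ == "__main__":
    # Print how quickly the model space grows with the number of variables.
    for q in (3, 5, 10, 20):
        print(q, n_graphs(q))
```

Already for q = 3 there are 8 candidate graphs, and the count grows super-exponentially in q, which is why eliciting a prior under each graph by hand, or scoring every graph, quickly becomes impractical.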
To create an automated Bayesian scoring technique, we resort to Objective Bayes approaches, which start from an improper prior distribution and produce a fully usable prior distribution. In this thesis, we propose the use of two alternative Objective Bayes approaches for estimating posterior model probabilities, namely the Expected Posterior prior approach and the Power-Expected Posterior prior approach. Both approaches use the device of imaginary observations to provide usable prior distributions and are theoretically sounder than the Fractional Bayes Factor of O'Hagan. Our goal is to introduce both the Expected and Power-Expected Posterior prior approaches to the field of structure learning of undirected graphical models and to evaluate their performance using certain stochastic search techniques. Diverse simulation scenarios are considered, as well as a real-life data application.

File | Size | Format |
---|---|---|
phd_unimib_816489.pdf (open access) | 6.5 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/170732
URN:NBN:IT:UNIMIB-170732