Complicacy Guided Parameter Space Sampling for Simulation Ensemble Generation in Dynamic Contexts, and for Knowledge Discovery with Limited Simulation Budgets

REDONDO ANTON, Javier
2025

Abstract

Many industrial and scientific tasks entail recovering an unknown function from an ensemble of experiments, each corresponding to a different scenario. This can generally be abstracted as evaluating an expensive, deterministic black-box function over a combinatorial parameter space that is much larger than the available evaluation budget, with sampling coverage becoming exponentially sparser as dimensionality grows. In many applications, however, the oracle is stochastic: repeated evaluations at the same configuration produce different outcomes because of simulation or measurement noise. The design problem then expands from “where to sample” to “where to sample and how many times to replicate,” requiring variance-aware allocation that balances spatial coverage against replication. This thesis presents a unified framework for sample-efficient exploration of parameter spaces when exhaustive evaluation is either impossible (large deterministic domains) or wasteful (moderate stochastic domains that require replication). The proposed algorithms, the gradient guided sampler (G2S) and the stochastic gradient guided sampler (SG2S), decompose the full space into low-dimensional subspaces and allocate additional queries through a multi-armed bandit that selects among three complementary sampling strategies: (i) gradient-driven refinement, (ii) targeting of tensor-decomposition error, and (iii) a hybrid that follows gradients of the reconstruction error. A custom sampling scheme places new points while respecting a fixed evaluation budget. In deterministic settings where the budget is orders of magnitude smaller than the cardinality of the space, the method concentrates samples on high-variation regions and achieves lower gradient-weighted prediction error than state-of-the-art Bayesian optimization and other competitors on high-dimensional parameter spaces.
In stochastic settings, such as budget-constrained political polling, the same mechanism reallocates redundant surveys toward strata with higher posterior variance, reducing prediction error relative to classical stratified sampling. The framework is generic, requiring only function evaluations, and scales to tens of parameters through subspace optimization, while its redundancy-aware design handles noisy oracles without wasting resources. Applications range from large-ensemble epidemiological simulations (PanCommunity) to adaptive survey design and hyperparameter tuning. Experiments confirm that the algorithm places more informative samples within the same budget, providing a practical tool for domains where each evaluation is costly in compute time, money, or human effort.
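The bandit-driven choice among sampling strategies described in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not say which bandit policy G2S or SG2S uses, so UCB1 is assumed here. The strategy names, the `make_strategy` helper, and the synthetic rewards (standing in for the error reduction obtained from each newly placed sample) are all hypothetical.

```python
import math
import random


def ucb1_select(counts, rewards, t, c=2.0):
    """Return the index of the arm with the highest UCB1 score."""
    # Play every arm once before relying on the confidence bound.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    scores = [
        rewards[a] / counts[a] + math.sqrt(c * math.log(t) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def run_bandit(strategies, budget, seed=0):
    """Spend a fixed budget, one evaluation per round, picking a strategy each round.

    `strategies` maps a name to a callable taking an RNG and returning a
    reward in [0, 1]. Returns how many evaluations each strategy received.
    """
    rng = random.Random(seed)
    names = list(strategies)
    counts = [0] * len(names)
    rewards = [0.0] * len(names)
    for t in range(1, budget + 1):
        arm = ucb1_select(counts, rewards, t)
        reward = strategies[names[arm]](rng)
        counts[arm] += 1
        rewards[arm] += reward
    return dict(zip(names, counts))


def make_strategy(mean):
    """Hypothetical stand-in: a noisy reward around `mean`, clipped to [0, 1]."""
    return lambda rng: min(1.0, max(0.0, rng.gauss(mean, 0.1)))


# Assumed reward levels, chosen only to make the example non-trivial.
strategies = {
    "gradient": make_strategy(0.2),
    "decomposition_error": make_strategy(0.5),
    "hybrid": make_strategy(0.8),
}
pulls = run_bandit(strategies, budget=300)
print(pulls)
```

In this toy setup the bandit never exceeds the evaluation budget, tries every strategy at least once, and directs most of the remaining evaluations to whichever strategy has been paying off best so far.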
20 Oct 2025
English
CANDAN, K. SELCUK
SAPINO, Maria Luisa
Università degli Studi di Torino
Files in this item:
File: JRedondo-Thesis-2025.pdf
Access: open access
License: All rights reserved
Size: 6.47 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/308330
The NBN code of this thesis is URN:NBN:IT:UNITO-308330