Machine Learning and Cryptocurrency Markets: Methods and Evidence

PENNELLA, LUCA
2026

Abstract

Cryptocurrency markets and blockchain-based financial infrastructures generate data at unprecedented scale and granularity, while also introducing novel risks, market microstructures, and fast-evolving regulatory debates. In parallel, machine learning (ML) delivers strong predictive performance but is frequently criticized for limited transparency, an issue that becomes central when model outputs can affect economic, legal, or policy decisions. This thesis develops reproducible, interpretable, and domain-specific methodologies at the intersection of explainable ML and digital-asset economics, organized around three complementary objectives: (i) designing explainable ML pipelines for complex socio-economic phenomena, (ii) characterizing investor heterogeneity and regulatory attitudes in crypto markets using international survey data, and (iii) measuring micro-level token circulation and systematizing decentralized derivatives protocols in decentralized finance (DeFi).
On the methodological side, the thesis proposes explainability workflows for risk-sensitive classification that preserve predictive quality while enabling credible interpretation at both local and global levels. It introduces X-SPIDE, an explainable pipeline for detecting smart Ponzi contracts on Ethereum, and demonstrates how interpretability-first principles can be transferred to voting-intention prediction using survey data and SHAP diagnostics. It further studies the interaction between class imbalance, class overlap, and oversampling, showing via controlled simulations that oversampling effectiveness depends on data geometry rather than imbalance alone. The thesis also proposes Decision Predicate Graphs, a model-specific global interpretability method for tree ensembles that supports structure-aware inspection when rule-based summaries become impractical.
On the empirical side, the thesis leverages international survey evidence to profile crypto users and link beliefs to behaviors. It documents that memecoin holders constitute a distinct segment with recognizable demographic and psychological patterns, and it maps heterogeneous support for regulatory domains to perceived market illegitimacy and to individual exposure to crypto wealth. At the DeFi protocol level, the thesis introduces a micro-velocity methodology tailored to liquid staking tokens and provides evidence on token circulation intensity, concentration of turnover, and a progressive shift toward wstETH consistent with composability. It also systematizes DeFi derivatives through a unified representation of actors, flows, and design principles, operationalized via a tuple-based formalism and a reproducible simulation environment for comparative analysis.
24 March 2026
English
Machine Learning; Blockchain; Cryptocurrencies; DeFi; Survey Data
BIASIOL FRANCESCO
GALLETTA LETTERIO
TORELLI, Nicola
Università degli Studi di Trieste
Files in this item:
PhD_Thesis_ADSAI_Pennella_final.pdf: open access; license: all rights reserved; 9.21 MB; Adobe PDF
PhD_Thesis_ADSAI_Pennella_final_1.pdf: open access; license: all rights reserved; 9.21 MB; Adobe PDF
PhD_Thesis_ADSAI_Pennella_final_2.pdf: open access; license: all rights reserved; 9.21 MB; Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/363718
The NBN code of this thesis is URN:NBN:IT:UNITS-363718