The digital transformation has fundamentally reshaped how organizations operate, leading to an unprecedented flood of data across all sectors. With the global digital transformation market projected to reach 3.4 trillion by 2026, the need to understand and leverage this data has become critical. Network science has emerged as a powerful approach for analyzing complex systems, offering tools to understand interconnected entities through the lens of nodes and edges. Network science has demonstrated its effectiveness across numerous domains. In social science, the field evolved from early sociograms to sophisticated analyses of social media behavior, polarization, and user migration patterns. Financial applications include the analysis of market stability, interbank relationships, and fraud detection through transaction networks. In biology and medicine, network approaches have revolutionized our understanding of protein interactions and disease mechanisms, while ecological applications have enhanced our grasp of species interactions and ecosystem dynamics. While initial research focused primarily on static networks, recent attention has shifted toward temporal networks that incorporate time-varying relationships. This evolution reflects the inherently dynamic nature of most complex systems. For instance, urban mobility networks capture changing traffic patterns throughout the day. Brain networks map the temporal progression of neural impulses. Social networks track the spread of information and trends over time. Financial networks reveal patterns in transaction timing and market responses. Disease transmission networks highlight the temporal aspects of outbreak patterns. Current research has primarily focused on describing network changes over time from a macroscopic perspective, leaving a significant gap in understanding the underlying mechanisms driving these evolutionary patterns. By extracting the mechanisms, named rules, of evolution from a temporal graph, we can reveal the general mesoscopic mechanisms governing network dynamics. This means understanding how mesoscopic network structures evolve over time. This thesis addresses this gap by developing a comprehensive framework, named GERANIO, for modeling, mining, and analyzing network evolution rules. Graph evolution rules (GER) mining is a frequency-based pattern discovery method used to examine the dynamics of networks as they evolve over time. GERs are designed to identify recurring local changes throughout the network's evolution. Drawing inspiration from the notion of association rules in the context of data mining, GERs consist of two essential components: a precondition and a set of postconditions. These rules indicate that a subgraph matching (being isomorphic to) the precondition is likely to evolve into one of the configurations represented by the postconditions, with the corresponding probability. Despite their potential, existing approaches face significant challenges, particularly in the complexity of identifying and analyzing GERs and the difficulties associated with isomorphism checks. The GERANIO framework makes several key contributions to address these challenges. First, it introduces a universal taxonomy for modeling evolving networks, applicable across various domains. Second, it provides a categorization method for isomorphic subgraphs through canonical coding, ensuring universal applicability of results. Third, it proposes ad-hoc null models to extract statistically significant rules. Fourth, it introduces the TULIP algorithm, which mines graph evolution rules from a broader perspective than existing approaches. Finally, it develops the concept of evolutionary profiles – probability distributions of rule frequencies that enable immediate visual feedback on evolving network behavior and network comparison. The framework distinguishes between stand-alone rules and general graph evolution rules. Stand-alone rules excel at identifying specific evolutionary processes, such as cycles in financial networks, while general rules capture the complete evolutionary behavior of a network. Both approaches benefit from the framework's canonical coding method, which provides a unique representation of graphs for easy comparison and analysis. This thesis is structured in four main parts: a comprehensive background on temporal networks modeling and theoretical foundations; an exploration of the GERANIO framework's stand-alone rules component; and an introduction to broader graph evolution rules through the TULIP algorithm. Finally, this thesis explores an evolutionary aspect typical of Web3 networks that cannot be fully captured by graph evolution rules alone: user migration. This phenomenon is particularly relevant in the context of blockchain-based online social networks (BOSNs) and presents unique challenges for network analysis. Through these contributions, this thesis aims to enhance our understanding of network dynamics and provide powerful tools for comparing different networks and analyzing changes over time.
La trasformazione digitale ha rivoluzionato le strategie organizzative, generando un flusso di dati senza precedenti. Con il mercato della trasformazione digitale proiettato a 3,4 trilioni di dollari entro il 2026, sfruttare questi dati è cruciale. La network science offre strumenti per analizzare sistemi complessi modellando nodi e archi. Ha applicazioni in diversi ambiti: nelle scienze sociali analizza social media, polarizzazione e migrazioni; in finanza studia stabilità dei mercati e frodi; in biologia e medicina svela interazioni proteiche e meccanismi patologici; in ecologia esplora dinamiche tra specie ed ecosistemi. Mentre la ricerca inizialmente si concentrava su reti statiche, l'attenzione si è spostata recentemente verso le reti temporali che incorporano relazioni variabili nel tempo. Questa evoluzione riflette la natura intrinsecamente dinamica della maggior parte dei sistemi complessi. Ad esempio, le reti di mobilità urbana catturano i mutevoli modelli di traffico durante il giorno. Le reti sociali tracciano la diffusione di informazioni e tendenze nel tempo mentre quelle finanziarie rivelano modelli nella tempistica delle transazioni e nelle risposte del mercato. Le reti di trasmissione delle malattie evidenziano gli aspetti temporali dei modelli di diffusione. La ricerca attuale si è concentrata principalmente sulla descrizione dei cambiamenti della rete nel tempo da una prospettiva macroscopica, lasciando una lacuna significativa nella comprensione dei meccanismi sottostanti che guidano questi modelli evolutivi. Estraendo i meccanismi evolutivi, altresì detti regole, da un grafo temporale, possiamo rivelare i meccanismi mesoscopici generali che governano le dinamiche della rete. Ciò significa comprendere come le strutture microscopiche e mesoscopiche della rete evolvono nel tempo. Questa tesi affronta questa lacuna della letteratura sviluppando un framework, detto GERANIO, per la modellazione, l'estrazione e l'analisi delle graph evolution rules. L'estrazione delle graph evolution rules (GER) è un metodo di pattern mining basato sulla frequenza di occorrenza di tali pattern. Le GER hanno lo scopo di identificare cambiamenti locali ricorrenti durante l'evoluzione della rete. In modo simile alle association rules nel contesto del data mining, le GER sono composte da due elementi (sottografi): una precondizione e un insieme di postcondizioni. Queste regole indicano che un sottografo che corrisponde (è isomorfo) alla precondizione è probabile che evolva in una delle configurazioni rappresentate dalle postcondizioni, con la probabilità corrispondente. Nonostante il loro potenziale, gli approcci esistenti affrontano sfide significative, in particolare nella complessità dell'identificazione e dell'analisi delle GER e nelle difficoltà associate all'isomorfismo tra grafi. Il framework GERANIO fornisce diversi contributi chiave per affrontare questi problemi. Innanzitutto, introduce una tassonomia universale per la modellazione delle reti evolutive, applicabile a vari domini. In secondo luogo, fornisce un metodo di categorizzazione per sottografi isomorfi attraverso un canonical coding, garantendo l'applicabilità universale dei risultati. In terzo luogo, propone modelli nulli ad hoc per estrarre regole statisticamente significative. In quarto luogo, introduce l'algoritmo TULIP, che estrae regole di evoluzione dei grafi da una prospettiva più ampia rispetto agli approcci esistenti. Infine, sviluppa il concetto di profili evolutivi – distribuzioni di probabilità delle frequenze delle regole che consentono un feedback visivo immediato sul comportamento della rete in evoluzione e sul confronto tra reti. Il framework fa una distinzione tra stand-alone rules e graph evolution rules generali. Le stand-alone rules eccellono nell'identificazione di processi evolutivi specifici, come i cicli nelle reti finanziarie, mentre le regole generali catturano il comportamento evolutivo completo di una rete. Ad entrambi gli approcci é possible applicate il canonica coding del framework, che fornisce una rappresentazione unica dei grafi per un facile confronto e analisi. Questa tesi è strutturata in quattro parti principali: un background completo sulla modellazione delle reti temporali e i concetti teorici; un'esplorazione della componente del framework GERANIO dedicata alle stand-alone rules; e un'introduzione all'algoritmo TULIP che consente di estrarre le graph evolution rules generali. Infine, questa tesi esplora un aspetto evolutivo tipico delle reti Web3 che non può essere completamente catturato dalle sole graph evolution rules: la user migration. Questo fenomeno è particolarmente rilevante nel contesto delle blockchain-based online social networks (BOSN). Attraverso questi contributi, questa tesi ha lo scopo di migliorare la nostra comprensione delle dinamiche delle reti e fornire strumenti per confrontare diverse reti e analizzarne i cambiamenti nel tempo.
A FRAMEWORK FOR NETWORK EVOLUTION
GALDEMAN, ALESSIA
2024
Abstract
The digital transformation has fundamentally reshaped how organizations operate, leading to an unprecedented flood of data across all sectors. With the global digital transformation market projected to reach 3.4 trillion by 2026, the need to understand and leverage this data has become critical. Network science has emerged as a powerful approach for analyzing complex systems, offering tools to understand interconnected entities through the lens of nodes and edges. Network science has demonstrated its effectiveness across numerous domains. In social science, the field evolved from early sociograms to sophisticated analyses of social media behavior, polarization, and user migration patterns. Financial applications include the analysis of market stability, interbank relationships, and fraud detection through transaction networks. In biology and medicine, network approaches have revolutionized our understanding of protein interactions and disease mechanisms, while ecological applications have enhanced our grasp of species interactions and ecosystem dynamics. While initial research focused primarily on static networks, recent attention has shifted toward temporal networks that incorporate time-varying relationships. This evolution reflects the inherently dynamic nature of most complex systems. For instance, urban mobility networks capture changing traffic patterns throughout the day. Brain networks map the temporal progression of neural impulses. Social networks track the spread of information and trends over time. Financial networks reveal patterns in transaction timing and market responses. Disease transmission networks highlight the temporal aspects of outbreak patterns. Current research has primarily focused on describing network changes over time from a macroscopic perspective, leaving a significant gap in understanding the underlying mechanisms driving these evolutionary patterns. By extracting the mechanisms, named rules, of evolution from a temporal graph, we can reveal the general mesoscopic mechanisms governing network dynamics. This means understanding how mesoscopic network structures evolve over time. This thesis addresses this gap by developing a comprehensive framework, named GERANIO, for modeling, mining, and analyzing network evolution rules. Graph evolution rules (GER) mining is a frequency-based pattern discovery method used to examine the dynamics of networks as they evolve over time. GERs are designed to identify recurring local changes throughout the network's evolution. Drawing inspiration from the notion of association rules in the context of data mining, GERs consist of two essential components: a precondition and a set of postconditions. These rules indicate that a subgraph matching (being isomorphic to) the precondition is likely to evolve into one of the configurations represented by the postconditions, with the corresponding probability. Despite their potential, existing approaches face significant challenges, particularly in the complexity of identifying and analyzing GERs and the difficulties associated with isomorphism checks. The GERANIO framework makes several key contributions to address these challenges. First, it introduces a universal taxonomy for modeling evolving networks, applicable across various domains. Second, it provides a categorization method for isomorphic subgraphs through canonical coding, ensuring universal applicability of results. Third, it proposes ad-hoc null models to extract statistically significant rules. Fourth, it introduces the TULIP algorithm, which mines graph evolution rules from a broader perspective than existing approaches. Finally, it develops the concept of evolutionary profiles – probability distributions of rule frequencies that enable immediate visual feedback on evolving network behavior and network comparison. The framework distinguishes between stand-alone rules and general graph evolution rules. Stand-alone rules excel at identifying specific evolutionary processes, such as cycles in financial networks, while general rules capture the complete evolutionary behavior of a network. Both approaches benefit from the framework's canonical coding method, which provides a unique representation of graphs for easy comparison and analysis. This thesis is structured in four main parts: a comprehensive background on temporal networks modeling and theoretical foundations; an exploration of the GERANIO framework's stand-alone rules component; and an introduction to broader graph evolution rules through the TULIP algorithm. Finally, this thesis explores an evolutionary aspect typical of Web3 networks that cannot be fully captured by graph evolution rules alone: user migration. This phenomenon is particularly relevant in the context of blockchain-based online social networks (BOSNs) and presents unique challenges for network analysis. Through these contributions, this thesis aims to enhance our understanding of network dynamics and provide powerful tools for comparing different networks and analyzing changes over time.File | Dimensione | Formato | |
---|---|---|---|
phd_unimi_R13320.pdf
accesso aperto
Dimensione
24.81 MB
Formato
Adobe PDF
|
24.81 MB | Adobe PDF | Visualizza/Apri |
I documenti in UNITESI sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.
https://hdl.handle.net/20.500.14242/184242
URN:NBN:IT:UNIMI-184242