Empowering XAI and LMI with Human-in-the-loop
NOBANI, NAVID
2023
Abstract
Once aimed at mimicking the human brain and existing only as mathematical models in academia, the vast family of Artificial Intelligence methods is now well past that initial goal: models with billions of parameters are trained on millions of mostly human-generated data points. Such models are present in almost every aspect of our lives, from weather forecasts and social network content to how our banks detect fraudulent transactions on our accounts and the maps that guide us to a restaurant through the unknown streets of a new city. All these advancements, though, share a common limitation: their performance is bounded by the amount of human knowledge we can feed them, i.e. training data that must keep pace with the ever-growing number of model parameters in both quantity and quality. Unfortunately, this requirement often comes at a high cost, since generating machine-friendly data and keeping it updated and maintained is enormously labour-intensive. One way to overcome this issue is the Human-in-the-Loop (HITL) paradigm: regarding humans not only as a passive part of the system, i.e. providers of inputs and consumers of outputs, but as an active part of AI systems who participate in the creation and validation of data, model parameters and model inputs. By doing so, we inject the system with up-to-date human knowledge that would otherwise have to arrive through expensive and often outdated training sets. This thesis proposes novel methods for integrating HITL into the eXplainable Artificial Intelligence (XAI) and Labour Market Intelligence (LMI) fields. In Part I, we propose and implement a conversational explanation system called \convxai, extending the current state of the art and introducing a new conversation type, the Clarification conversation.
Following the HITL paradigm, \convxai differentiates itself from classic XAI systems, which create one-size-fits-all explanations regardless of the user's knowledge level, background and needs, by providing explanations that fit the user's context and exploit the information the user provides. This model is built on anonymized data provided by the company Digital Attitude S.r.l. In Part II, we present a model called \taxorefs, which achieves its objective, taxonomy refinement, by considering domain experts both as providers of the input data (a taxonomy) and, at the same time, as final validators of the model's suggestions. This method was developed using data provided by Tabulaex/Burning Glass Technologies.
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/126500
URN:NBN:IT:UNIMIB-126500