Building Human Digital Twins through Natural Language Processing: From Conversations to Knowledge

SERRELI, LUIGI
2025

Abstract

Human Digital Twins (HDTs) are a revolutionary advancement in human-computer interaction, offering personalised digital representations of individual knowledge, behaviour, and preferences. These sophisticated systems face fundamental challenges in ensuring the reliable organisation and processing of extensive conversational data. This thesis presents an integrated framework that addresses these challenges through advanced topic modelling, natural language processing, and information verification techniques. A key objective is to enhance HDTs' efficiency in capturing and organising human knowledge from diverse conversational sources. This involves developing novel methods for reducing the dimensionality of conversational data while preserving essential semantic relationships. By improving the scalability of knowledge organisation, HDTs can effectively process large-scale text data without compromising information quality. Recognising the potential for errors in information retrieval, similar to the issue of "hallucinations" in Large Language Models, this thesis also explores robust verification mechanisms. Drawing parallels with the detection of misinformation in human-generated content, such as fake news, the research investigates the underlying patterns by which inaccurate information spreads. This analysis leads to the development of innovative approaches for ensuring response reliability in HDTs. The experimental validation of the framework encompasses various scales of text analysis, from short-form content like social media posts to long-form articles and technical papers. The results demonstrate the effectiveness of the integrated framework in organising and verifying information while maintaining semantic coherence across different textual contexts.
This research contributes to the broader understanding of how HDTs can evolve into reliable partners in human-AI interaction, capable of handling complex information processing tasks while maintaining high standards of accuracy and trustworthiness. The methodologies and insights presented open new avenues for research into proactive HDT systems that can effectively serve as intelligent assistants while preserving the critical balance between capability and reliability.
21 March 2025
English
NITTI, MICHELE
Università degli Studi di Cagliari
Files in this item:
File: tesi di dottorato _Luigi Serreli.pdf
Access: open access
Size: 9.42 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/208583
The NBN code of this thesis is URN:NBN:IT:UNICA-208583