Building Human Digital Twins through Natural Language Processing: From Conversations to Knowledge
SERRELI, LUIGI
2025
Abstract
Human Digital Twins (HDTs) are a revolutionary advancement in human-computer interaction, offering personalised digital representations of individual knowledge, behaviour, and preferences. These sophisticated systems face fundamental challenges in ensuring the reliable organisation and processing of extensive conversational data. This thesis presents an integrated framework that addresses these challenges through advanced topic modelling, natural language processing, and information verification techniques. A key objective is to enhance HDTs' efficiency in capturing and organising human knowledge from diverse conversational sources. This involves developing novel methods for reducing the dimensionality of conversational data while preserving essential semantic relationships. By improving the scalability of knowledge organisation, HDTs can effectively process large-scale text data without compromising information quality. Recognising the potential for errors in information retrieval, similar to the issue of "hallucinations" in Large Language Models, this thesis also explores robust verification mechanisms. Drawing parallels with detecting misinformation in human-generated content, such as fake news, the research investigates underlying patterns of inaccurate information dissemination. This analysis leads to the development of innovative approaches for ensuring response reliability in HDTs. The experimental validation of the framework encompasses various scales of text analysis, from short-form content like social media posts to long-form articles and technical papers. The results demonstrate the effectiveness of the integrated framework in organising and verifying information while maintaining semantic coherence across different textual contexts.
This research contributes to the broader understanding of how HDTs can evolve into reliable partners in human-AI interaction, capable of handling complex information processing tasks while maintaining high standards of accuracy and trustworthiness. The methodologies and insights presented open new avenues for research into proactive HDT systems that can effectively serve as intelligent assistants while preserving the critical balance between capability and reliability.

File | Size | Format |
---|---|---|
tesi di dottorato _Luigi Serreli.pdf (open access) | 9.42 MB | Adobe PDF |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/208583
URN:NBN:IT:UNICA-208583