Building language-independent culture-aware multilingual lexical resources

Chandran Nair, Nandu
2022

Abstract

Language is essential for any society to thrive. Lexical resources are the building blocks of any language; they allow us to identify similarities and differences when comparing languages. However, limitations such as scarce funding and a lack of expert support hinder the development of language resources, and consequently many minor languages are becoming extinct. One possible way to preserve a language is to connect its lexical resources to those of a widely used language such as English. However, the reference language may then influence the language development and mapping process. This thesis proposes a methodology for language development and mapping that avoids the supremacy of a reference language, and thus a strategy for conserving languages that counters one language's dominance over another within the resource. The proposed methodology builds improved, up-to-date, concept-oriented multilingual lexical resources from existing ones. Such resources can be used to compare languages, study their differences and similarities, and exploit this information to measure and improve the quality of the languages. The thesis also shows the importance of the structural organization of multilingual resources for representing meaning across languages. The thesis focuses on Indian languages, but the methodologies described can be adapted to any other language. The main outcomes are (i) a methodology for creating a multilingual resource that does not depend on a reference language and (ii) a good-quality concept-oriented resource for various Indian languages, offered to the community to help preserve the culture.
8 November 2022
English
Giunchiglia, Fausto
Università degli studi di Trento
TRENTO
115
Files in this item:
phd_unitn_Chandran Nair_Nandu.pdf
Access: open access
Size: 7.55 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/60518
The NBN code of this thesis is URN:NBN:IT:UNITN-60518