A computational morphological analysis of Italian verbal system

Pascoli, Matteo
2015

Abstract

Verbal inflection in Italian, as in other Romance languages, is complex. Its complexity derives not just from the number of forms, each coupled with a distinct set of morphosyntactic properties (mood, tense, person, number), but also, and especially, from the variability of those forms. While structuring the verbal lexicon into classes can account for the variability in the endings of the inflected forms (the desinences), it cannot account for the variability in the stem, because too many classes would be needed to classify these phenomena of allomorphy. The traditional approach requires the speaker to memorize, as exceptions, a list of the forms whose stem is not identical to that of other forms of the same paradigm, or in particular to that of the citation form of the lexeme (the infinitive, for Italian verbs). In the last twenty years, there has been much interest in studying the paradigmatic distribution of allomorphy, that is, the way in which the variation (the traditional “irregularity”) between forms of a paradigm (not only of verbs, but also of nouns and adjectives) rests on regular patterns. This interest has at least three motivations. The first is purely technical, driven by the desire to pack morphological information as densely as possible, in order to build computationally efficient applications that parse, interpret, analyse, translate or produce text (or speech) without having to search through enormous amounts of redundant data. The second is cognitive: studies of analogical associations, and of how these associations form patterns and schemes, can contribute to our understanding of how the brain works. The third is didactic, since language learning can benefit greatly from knowledge of such patterns of association and how they operate.
The practical goal of this line of research is to analyse the paradigmatic structure of inflection, that is, to decompose the paradigm into zones whose forms are built on possibly distinct basic stems, and to examine the formal relations (at the phonological level) between these basic stems, studying the chains of predictability that allow us, as speakers, to handle both regular and irregular lexemes. In this work I have carried out an analysis of the Italian verbal system. Following a Word and Paradigm point of view, and the researchers who have studied inflectional morphology with a paradigmatic approach, my goal was to build algorithms and programs to compute the relations between the word forms that make up the whole inflection of a sample of Italian verbs. The set of evaluated verbs covers all conjugation models, including highly irregular verbs. The contribution to inflectional morphology rests on these points:
– the analysis is carried out on phonetic forms, as opposed to orthographic forms. I have therefore developed a database that generates the forms for all paradigm cells in phonetic transcription.
– the analysis is fully automated. I have developed all the necessary algorithms in Java, so that after a change in the database (to add lexemes or correct mistakes), or even a switch to another data set in order to analyse other languages, the whole computation takes a few minutes to run.
– the analysis does not depend on the assumption that inflection happens at the end of the word, or by suffixation: the algorithms developed can handle discontinuous inflection (as found in Semitic languages, or partially in German and Greek, for example) using the same principles.
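
The idea of comparing two forms without assuming suffixation can be illustrated with a minimal Java sketch. It is not the algorithm developed in the thesis, which the abstract does not describe. It assumes a longest-common-subsequence alignment as a stand-in technique, and the class and method names (`SharedMaterial`, `align`, `residue`) are hypothetical. The example strings are plain romanizations chosen for illustration, not the phonetic transcriptions used in the thesis.

```java
// Hypothetical sketch: aligns two word forms by longest common subsequence (LCS)
// and shows what each form keeps for itself. Not the thesis's own algorithm.
final class SharedMaterial {

    /** Returns two masks marking which positions of a and b belong to one LCS. */
    static boolean[][] align(String a, String b) {
        int n = a.length(), m = b.length();
        // len[i][j] = LCS length of the suffixes a[i..] and b[j..]
        int[][] len = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                len[i][j] = a.charAt(i) == b.charAt(j)
                        ? len[i + 1][j + 1] + 1
                        : Math.max(len[i + 1][j], len[i][j + 1]);
            }
        }
        boolean[] sharedA = new boolean[n];
        boolean[] sharedB = new boolean[m];
        int i = 0, j = 0;
        // Walk forward through the table, marking matched positions.
        while (i < n && j < m) {
            if (a.charAt(i) == b.charAt(j)) {
                sharedA[i++] = true;
                sharedB[j++] = true;
            } else if (len[i + 1][j] >= len[i][j + 1]) {
                i++;
            } else {
                j++;
            }
        }
        return new boolean[][] { sharedA, sharedB };
    }

    /** Replaces shared positions with '_' so only the form-specific material remains. */
    static String residue(String form, boolean[] shared) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < form.length(); i++) {
            sb.append(shared[i] ? '_' : form.charAt(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Suffixal contrast (Italian, orthographic for simplicity): parlo / parla
        boolean[][] s1 = align("parlo", "parla");
        System.out.println(residue("parlo", s1[0]) + " / " + residue("parla", s1[1]));

        // Discontinuous contrast (romanized Arabic): kataba / kutiba
        boolean[][] s2 = align("kataba", "kutiba");
        System.out.println(residue("kataba", s2[0]) + " / " + residue("kutiba", s2[1]));
    }
}
```

In this sketch each Java `char` is treated as one segment. Real phonetic transcriptions with diacritics or multi-character segments would need a list of segment strings instead. The point is only that the same comparison handles a difference at the end of the word and a difference spread across the word.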
2015
English
morphology; computational; paradigm; distribution; Italian
157
Files in this record:
phd_thesis_mpascoli.pdf (open access, 3.39 MB, Adobe PDF)

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/112739
The NBN code of this thesis is URN:NBN:IT:UNIVR-112739