A quantitative approach to the study of syntactic evolution

2009

Abstract

The dissertation explores the application of quantitative algorithmic procedures to the study of language evolution. In particular, the inquiry applies quantitative methods originally designed within molecular biology and population genetics to a parametric comparative dataset. The goal is to infer hypotheses about the genealogical relationships among a specific set of languages, accounting for the role of areal convergence in linguistic variation, and to evaluate them in light of the traditional accounts provided by historical linguistics. The first focus is the comparison between language evolution and biological evolution: the idea is that some important features of language development may be identified by drawing a parallel with the biological domain. On the whole, this analysis suggests that language evolution and biological evolution differ considerably in some respects, but that these dissimilarities do not prevent the application of quantitative reconstruction procedures. The most recent generative views on syntactic change are then considered and shown to be fully compatible with the evolutionary account outlined. To this end, basic notions regarding the cognitive-biolinguistic and formal aspects of generative grammar are illustrated and, once the parametric perspective on synchronic language variation has been clarified, the discussion turns to the extension of the parametric approach to the explanation of diachronic phenomena, including genealogical development and contact. The next step is the presentation of the various methods of comparison adopted in historical linguistics and population genetics and, in particular, of the “Parametric comparison method”. The parallel between the latter and the procedures of investigation used in molecular biology paves the way for the introduction of the relevant quantitative techniques of phylogenetic reconstruction.
After an overview of the principal datasets used so far in quantitative investigations of the history of languages, the parametric dataset is presented, together with an overview of “traditional” and quantitative proposals regarding the genealogical classification of the languages included in the investigation. The last section of the work illustrates the quantitative analyses carried out. A preliminary character-based and distance-based review of the dataset is followed by a discussion of the choice of the phylogenetic methods adopted. The first set of phylogenetic reconstructions on the full dataset is then presented and commented on in detail. The next focus is on possible strategies to account for homoplasy (i.e. chance and borrowing): an empirically based selection of parameters is proposed, along with suggestions on how parameters might be weighted according to their genealogical relevance. Finally, some tentative analyses are introduced concerning the possibility of detecting and accounting for borrowing in phylogenetic trees, the reconstruction of ancestral states, and the mapping of syntactic distances onto the diachronic and diatopic dimensions of variation. On the whole, the quantitative analyses appear to provide good indications of several facts: that phylogenetic techniques are to a large extent effectively applicable to the study of syntactic evolution, that parametric comparison may help shed light on both short- and long-range genealogical relationships, and that traces of genuine genealogical relatedness are likely to be preserved (and to be recoverable despite homoplasy) at the level of “macro-comparison”, such as that instantiated in the parametric data.
5 May 2009
Italian
Longobardi, Giuseppe
Lenci, Alessandro
Università degli Studi di Pisa
Files in this item:
File: Gabriele_Rigon.pdf
Access: open access
Type: Other attached material
Size: 3.52 MB
Format: Adobe PDF

Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14242/150822
The NBN code of this thesis is URN:NBN:IT:UNIPI-150822