A quantitative approach to the study of syntactic evolution
2009
Abstract
The dissertation covers experiments with quantitative algorithmic procedures for the study of language evolution. In particular, the inquiry applies quantitative methods originally designed within molecular biology and population genetics to a parametric comparative dataset. The goal is to infer hypotheses regarding the genealogical relationships among a specific set of languages, accounting for the role of areal convergence in linguistic variation, and to evaluate them in light of the traditional accounts provided by historical linguistics. The first focus is on the comparison between language evolution and biological evolution: the idea is that some important features of language development may be identified by drawing a parallel with the biological domain. On the whole, this analysis suggests that language evolution and biological evolution differ considerably in some respects, but that the dissimilarities do not prevent the application of quantitative reconstruction procedures. The most recent generative views on syntactic change are then taken into consideration and shown to be fully compatible with the evolutionary account outlined. To this end, basic notions regarding the cognitive-biolinguistic and formal aspects of generative grammar are illustrated and, once the parametric perspective on synchronic language variation is clarified, the discussion turns to the extension of the parametric approach to the explanation of diachronic phenomena, including genealogical development and contact. The next step is the presentation of the various methods of comparison adopted in historical linguistics and population genetics and, in particular, of the “Parametric comparison method”: the parallel between the latter and the procedures of investigation used in molecular biology paves the way for the introduction of the relevant quantitative techniques of phylogenetic reconstruction.
After an overview of the principal datasets used so far to perform quantitative investigations on the history of languages, the parametric dataset is presented, and an overview of “traditional” and quantitative proposals regarding the genealogical classification of the languages included in the investigation is provided. The last section of the work illustrates the quantitative analyses carried out. A preliminary character-based and distance-based review of the dataset is followed by a discussion of the choice of the phylogenetic methods adopted. The first set of phylogenetic reconstructions on the full dataset is then presented and commented on in detail. The subsequent focus is on possible strategies to account for homoplasy (i.e. chance and borrowing): an empirically based selection of parameters is proposed, along with suggestions regarding the way in which parameters might be weighted according to their genealogical relevance. Finally, some tentative analyses are introduced concerning the possibility of detecting and accounting for borrowing in phylogenetic trees, the reconstruction of ancestral states, and the mapping of syntactic distances onto the diachronic and diatopic dimensions of variation. On the whole, the quantitative analyses appear to provide good indications of several facts: that phylogenetic techniques are to a large extent effectively applicable to the study of syntactic evolution, that parametric comparison may successfully help shed light on both short- and long-range genealogical relationships, and that traces of proper genealogical relatedness are likely to be preserved (and to be recoverable despite homoplasy) at the level of “macro-comparison”, like that instantiated in the parametric data.

File | Size | Format | |
---|---|---|---|
Gabriele_Rigon.pdf (open access; type: other attached material) | 3.52 MB | Adobe PDF | |
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14242/150822
URN:NBN:IT:UNIPI-150822