The Probabilistic Typology of Vowel Systems

Embargo until
2025-08-01
Date
2021-07-19
Publisher
Johns Hopkins University
Abstract
Linguistic typology studies the range of structures present in human language. The main goal of the field is to discover which sets of possible phenomena are universal and which are merely frequent. For example, all languages have vowels, while most, but not all, languages have a [u] sound. In this work, we present the first probabilistic treatment of a basic question in phonological typology: What makes a natural vowel inventory? We introduce a series of deep probability models. In Chapter 1, we give an overview of the relevant background material in phonetics and the typology of vowel systems. In Chapter 2, we introduce a series of deep stochastic point processes and contrast them with previous computational, simulation-based approaches. We provide a comprehensive suite of experiments on over 200 distinct languages. In Chapter 3, we work directly with the acoustic information (the first two formant values) rather than modeling discrete sets of phonemic symbols (IPA). We develop a novel generative probability model and report results based on the same corpus of over 200 languages. In Chapter 4, we focus on a functional account of vowel systems. The typology of vowel systems can be explained in part by functional pressures on communication. We find that a model of vowel token formants is more predictive of held-out data when it is trained with the help of a prior that encodes these pressures (that is, by MAP rather than ML estimation).
Keywords
Computer science, natural language processing, computational linguistics, linguistics, linguistic typology, phonology, phonetics