Speaker: Prof. Dr. Gerhard Jäger, University of Tübingen, Institute of Linguistics

Title: From Words to Features to Trees: Computing a World Tree of Languages from Word Lists
Presentation in English

Date: Monday, 16 October 2017, 11:00 a.m.
Location: Carl-Bosch-Auditorium, Studio Villa Bosch, Schloss-Wolfsbrunnenweg 33, 69118 Heidelberg (Studio entrance between Villa Bosch and HITS)
Parking: Parking garage "Unter der Boschwiese" (free of charge)

Abstract:
For over 200 years, historical linguists have sought to reconstruct family trees of human languages through systematic comparison of the vocabulary and grammar of extant or documented languages. For about 20 years, these efforts have been complemented by computational approaches that deploy phylogenetic inference algorithms from computational biology to analyse language data. So far, both lines of research have been confined to individual language families, i.e., phylogenetic units with a time depth of at most 10,000 years.
In this talk, I will present and discuss a workflow that starts out from unannotated word lists from ca. 6,000 languages and dialects across the world. Feature extraction techniques from machine learning are used to derive a feature matrix, which in turn serves as input for maximum-likelihood phylogenetic inference (using the software RAxML). This leads to a phylogenetic tree over those languages and dialects, which is in very good agreement with expert classifications, correlates well with anthropological and genetic data, and also reveals some interesting deeper signals.

Curriculum vitae:
Gerhard Jäger (http://www.sfs.uni-tuebingen.de/~gjaeger/) is Professor of General Linguistics at Tübingen University. He received his PhD and habilitation from Humboldt University at Berlin and previously held positions at Munich, UPenn, Utrecht and Stanford. He is PI of the ERC Advanced Grant "Language Evolution: The Empirical Turn" and co-PI of the interdisciplinary DFG Research Unit "Words, Bones, Genes, Tools: Tracking Linguistic, Cultural and Biological Trajectories of the Human Past". His research interests include computational historical linguistics and game-theoretic pragmatics.

Contact:
Benedicta Frech (phone: 06221-533-263)
