LiLa: Linking Latin

Building a Knowledge Base of Linguistic Resources for Latin

Invited Talk at the Fred Jelinek Seminar Series

Marco Passarotti will be the next invited speaker at the Fred Jelinek Seminar Series, organized by the Institute of Formal and Applied Linguistics at Charles University, Czech Republic.

The talk will be streamed via Zoom on Monday, 26 April 2021, at 14:00. For details on how to join the Zoom meeting, please write to sevcikova et ufal.mff.cuni.cz


Title: Words and Classes. Branches and Links. Interlinking (Latin) Resources in the Linguistic Linked Open Data World through the LiLa Knowledge Base

Abstract: The talk presents the LiLa Knowledge Base (https://lila-erc.eu), a collection of multifarious linguistic resources for Latin described with the same vocabulary of knowledge description and interlinked according to the principles of the so-called Linked Data paradigm. Following its highly lexically based nature, the core of the LiLa Knowledge Base consists of a large collection of Latin lemmas, serving as the backbone to achieve interoperability between the resources, by linking all those entries in lexical resources and tokens in corpora that point to the same lemma. After detailing the architecture supporting LiLa, the talk:
a) describes the LiLa collection of lemmas, particularly focussing on how the Knowledge Base approaches the challenges raised by harmonizing different strategies of lemmatization that can be found in linguistic resources for Latin;
b) details the modeling and linking of a number of textual and lexical resources for Latin, including a dependency treebank, an etymological dictionary, a polarity lexicon, a derivational lexicon and a valency lexicon;
c) shows the prototype of a tool to automatically link a raw Latin text to the Knowledge Base and presents some SPARQL queries to extract information taken from the interoperable resources currently linked to LiLa.
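
The lemma-as-backbone idea in the abstract can be illustrated with a minimal Python sketch. It is not LiLa's implementation: the lemma identifiers, the variant forms, the token and dictionary records, and the function names are all hypothetical stand-ins, and plain dictionaries take the place of the Linked Data machinery. It only shows how corpus tokens and lexicon entries become connected once both point to the same lemma, even when the resources cite that lemma under different forms.

```python
from collections import defaultdict

# Hypothetical lemma collection: each lemma identifier (a stand-in for a URI)
# lists the citation forms under which different resources may record it.
LEMMA_BANK = {
    "lemma:amo": {"amo"},
    "lemma:honor": {"honor", "honos"},
}


def build_index(lemma_bank):
    """Map every citation form to the lemma identifiers that list it."""
    index = defaultdict(set)
    for lemma_id, forms in lemma_bank.items():
        for form in forms:
            index[form.lower()].add(lemma_id)
    return index


def link(items, index, key):
    """Attach each item to every lemma whose forms match item[key]."""
    links = defaultdict(list)
    for item in items:
        for lemma_id in sorted(index.get(item[key].lower(), ())):
            links[lemma_id].append(item["id"])
    return dict(links)


# Hypothetical resources: a corpus lemmatized with "honos", a dictionary
# that uses the headword "honor", and one token with an unknown lemma.
corpus_tokens = [
    {"id": "tok1", "lemma": "honos"},
    {"id": "tok2", "lemma": "amo"},
    {"id": "tok3", "lemma": "xyz"},
]
lexicon_entries = [
    {"id": "dict:honor", "headword": "honor"},
]

index = build_index(LEMMA_BANK)
token_links = link(corpus_tokens, index, "lemma")
entry_links = link(lexicon_entries, index, "headword")
```

In this sketch the token lemmatized as "honos" and the dictionary entry headed "honor" both end up attached to the same lemma identifier, which is what makes the two resources queryable together; the token with an unknown lemma is simply left unlinked.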