LiLa: Linking Latin

Building a Knowledge Base of Linguistic Resources for Latin

CIRCSE Resources in CLARIN-IT

A set of linguistic resources for Latin developed at the CIRCSE research center and within the LiLa project is now available in a dedicated collection of the ILC4CLARIN repository of CLARIN-IT: https://dspace-clarin-it.ilc.cnr.it/repository/xmlui/handle/000-c0-111/525

The resources currently available are:

  • LiLa Lemma Bank: large collection of Latin lemmas, each described with grammatical and morphological information;
  • Index Thomisticus Treebank: analytical and tectogrammatical annotation of a portion of the Index Thomisticus corpus;
  • Latin Vallex v.1: valency lexicon;
  • LatinAffectus: prior polarity lexicon;
  • Index Graecorum Vocabulorum in Linguam Latinam: manually corrected OCR of G.A. Saalfeld’s list of Latin loans from Ancient Greek (1874);
  • Word Formation Latin: derivational morphology lexicon;
  • EvaLatin 2020 Data: training and gold test data for lemmatizers and PoS taggers;
  • The Etymological Dictionary of Latin and the other Italic Languages: collection of Proto-Italic and Proto-Indo-European reconstructed forms.