LiLa: Linking Latin

Building a Knowledge Base of Linguistic Resources for Latin

LDK 2021 tutorial

Language, Data & Knowledge (LDK)

This tutorial will be held during the third edition of the Language, Data & Knowledge conference (LDK 2021). COVID-permitting, the conference will take place from 1 to 3 September in Zaragoza, Spain.

Date and time of the activity

September 1, 2021.

Location

Zaragoza, Spain.

The LDK’21 conference is planned as a face-to-face event, but participants who cannot attend in person owing to COVID-19 travel restrictions will be able to present remotely.

Description

This tutorial falls within the area of Linguistic Linked Open Data (LLOD). By applying Linked Data and FAIR principles, the LiLa: Linking Latin project makes linguistic resources (e.g. textual corpora, lexica, dictionaries) for Latin interact on the web via a lexical basis made of a collection of lemmas known as the LiLa lemma bank.

Objective

In this hands-on tutorial, participants will learn how to link a Latin text to the LiLa Knowledge Base of linguistic resources. By the end of the tutorial, participants will have a better understanding of 1) the benefits of linking a Latin text to the LiLa Knowledge Base, and 2) the work required to help machines process linguistic data and produce quality resources.

Target audience

This activity is intended for those (e.g. computational linguists, theoretical linguists, classicists, philologists) who wish to publish Latin texts on the web and connect those texts to the wealth of interoperable linguistic resources already linked to LiLa. No prior experience with Natural Language Processing or Linked Data technologies is required, but participants are expected to have some basic understanding of lemmatisation, Part-of-Speech (PoS) tagging and Linked Data. While knowledge of Latin is preferable, participants who don’t know Latin but are nevertheless interested in the project and its methods are also welcome to join.

Materials and technical requirements

The text and tools necessary to participate in the event will be provided by the LiLa team before and during the tutorial. The tutorial is designed to work with desktop computers and laptops, *not* tablets or smartphones.

Programme

The activity will consist of two parts.

Part 1. Theory: Presentation of the structure of the LiLa Knowledge Base.

Part 2. Practice: Preparing and linking a Latin text to LiLa. Workflow outline:

  • Automatic assessment of the lexical overlap between the Latin text and the LiLa lemma bank
    • Disambiguation of ambiguous matches
    • Correction of PoS-tagging/lemmatisation errors
    • Addition of lemmas currently missing from LiLa
  • Conversion of the lemmatised text to RDF Turtle syntax for inclusion in the LiLa Knowledge Base
  • Querying the newly added text and the entire Knowledge Base with the RDF query language SPARQL
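
To give a rough idea of what the first steps of the workflow above involve, here is a minimal Python sketch. It is an illustration only and not the tooling provided by the LiLa team: the miniature lemma bank, the identifiers, the `example.org` namespace, the `ex:form` and `ex:hasLemma` properties and the function names `assess_overlap` and `to_turtle` are all hypothetical. The sketch sorts lemmatised tokens into unique matches, ambiguous matches and lemmas missing from the bank, then writes the unique matches as simple Turtle statements.

```python
# Hypothetical miniature lemma bank: (lemma, PoS) -> candidate lemma identifiers.
# Identifiers and namespace are placeholders, not real LiLa IRIs.
LEMMA_BANK = {
    ("amo", "VERB"): ["lemma/1"],
    ("puella", "NOUN"): ["lemma/2"],
    # Two homographic nouns: populus 'people' and populus 'poplar'.
    ("populus", "NOUN"): ["lemma/3", "lemma/4"],
}

# Hypothetical lemmatised and PoS-tagged text: (form, lemma, PoS).
TOKENS = [
    ("puellam", "puella", "NOUN"),
    ("amat", "amo", "VERB"),
    ("populi", "populus", "NOUN"),
    ("computatrum", "computatrum", "NOUN"),
]


def assess_overlap(tokens, bank):
    """Sort tokens into unique, ambiguous and missing matches against the bank."""
    report = {"unique": [], "ambiguous": [], "missing": []}
    for form, lemma, pos in tokens:
        candidates = bank.get((lemma, pos), [])
        if len(candidates) == 1:
            report["unique"].append((form, candidates[0]))
        elif candidates:
            # More than one candidate: needs manual disambiguation.
            report["ambiguous"].append((form, candidates))
        else:
            # No candidate: either a tagging/lemmatisation error or a new lemma.
            report["missing"].append((form, lemma, pos))
    return report


def to_turtle(unique_matches, base="http://example.org/"):
    """Write uniquely matched tokens as Turtle, using a placeholder vocabulary.

    Assumes word forms contain no characters that need escaping in Turtle.
    """
    lines = [f"@prefix ex: <{base}> ."]
    for i, (form, lemma_id) in enumerate(unique_matches, start=1):
        lines.append(
            f'ex:token{i} ex:form "{form}" ; ex:hasLemma <{base}{lemma_id}> .'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    report = assess_overlap(TOKENS, LEMMA_BANK)
    print(report["ambiguous"])
    print(report["missing"])
    print(to_turtle(report["unique"]))
```

In this sketch, only tokens with exactly one candidate are serialised. Ambiguous and missing tokens are held back for the manual steps listed in the workflow (disambiguation, error correction, addition of new lemmas).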

Twitter

On Twitter, look for the hashtag #ldk2021 and/or the handle @ERC_LiLa to follow updates from this tutorial.

Registration

TBA