LDK 2021 tutorial
Language, Data & Knowledge (LDK)
Date and time of the activity
September 1, 2021.
The LDK’21 conference is planned as a face-to-face event, but will also allow remote presentations by participants who cannot attend in person because of COVID-19 travel restrictions.
This tutorial falls within the area of Linguistic Linked Open Data (LLOD). By applying Linked Data and FAIR principles, the LiLa: Linking Latin project makes linguistic resources for Latin (e.g. textual corpora, lexica, dictionaries) interact on the web through a shared lexical basis: a collection of lemmas known as the LiLa lemma bank.
In this hands-on tutorial, participants will learn how to link a Latin text to the LiLa Knowledge Base of linguistic resources. By the end of the tutorial participants will have a better understanding of 1) the benefits of linking a Latin text to the LiLa Knowledge Base, and 2) the work required to help machines process linguistic data and produce quality resources.
This activity is intended for those who wish to publish Latin texts on the web (e.g. computational linguists, theoretical linguists, classicists, philologists) and to connect them to the wealth of interoperable linguistic resources already linked to LiLa. No hands-on experience with Natural Language Processing or Linked Data technologies is required, but participants are expected to have a basic understanding of lemmatisation, Part-of-Speech (PoS) tagging and Linked Data. While knowledge of Latin is preferable, participants who do not know Latin but are interested in the project and its methods are also welcome to join.
Materials and technical requirements
The text and tools necessary to participate in the event will be provided by the LiLa team before and during the tutorial. The tutorial is designed to work with desktop computers and laptops, *not* tablets or smartphones.
The activity will consist of two parts.
Part 1. Theory: Presentation of the structure of the LiLa Knowledge Base.
Part 2. Practice: Preparing and linking a Latin text to LiLa. Workflow outline:
- Automatic assessment of the lexical overlap between the Latin text and the LiLa lemma bank
- Disambiguation of ambiguous matches
- Correction of PoS-tagging/lemmatisation errors
- Addition of lemmas currently missing from LiLa
- Conversion of the lemmatised text to RDF Turtle syntax for inclusion in the LiLa Knowledge Base
- Querying the newly added text and the entire Knowledge Base with the RDF query language SPARQL
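To give a rough idea of what the first and fifth steps above involve, here is a minimal Python sketch. It is not the LiLa team's tooling: the miniature lemma bank, the `example.org` URIs, the `ex:` vocabulary and the function names `classify` and `to_turtle` are all assumptions made up for illustration, and the ambiguity of "cano" is invented only to exercise the ambiguous branch. The sketch sorts lemmatised tokens into unique, ambiguous and missing matches against the lemma bank, then writes the uniquely matched tokens as Turtle triples.

```python
# Hypothetical miniature lemma bank: (lemma, PoS) -> candidate lemma URIs.
# These URIs are placeholders, not real LiLa identifiers.
LEMMA_BANK = {
    ("arma", "NOUN"): ["http://example.org/lemma/1"],
    ("vir", "NOUN"): ["http://example.org/lemma/2"],
    # Two candidates, invented here to illustrate an ambiguous match.
    ("cano", "VERB"): ["http://example.org/lemma/3",
                       "http://example.org/lemma/4"],
}


def classify(tokens, bank):
    """Sort (form, lemma, pos) tokens into unique, ambiguous and missing."""
    report = {"unique": [], "ambiguous": [], "missing": []}
    for index, (form, lemma, pos) in enumerate(tokens, start=1):
        candidates = bank.get((lemma, pos), [])
        if len(candidates) == 1:
            key = "unique"
        elif candidates:
            key = "ambiguous"   # needs manual disambiguation
        else:
            key = "missing"     # tagging error or lemma absent from the bank
        report[key].append((index, form, lemma, pos, candidates))
    return report


def escape(literal):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return literal.replace("\\", "\\\\").replace('"', '\\"')


def to_turtle(report, base="http://example.org/text/"):
    """Serialise uniquely matched tokens; ex: is a placeholder vocabulary."""
    lines = ["@prefix ex: <http://example.org/vocab#> ."]
    for index, form, _lemma, _pos, candidates in report["unique"]:
        lines.append(
            f'<{base}token{index}> ex:form "{escape(form)}" ; '
            f"ex:hasLemma <{candidates[0]}> ."
        )
    return "\n".join(lines)


# A short lemmatised, PoS-tagged sample: (form, lemma, pos).
tokens = [
    ("Arma", "arma", "NOUN"),
    ("virum", "vir", "NOUN"),
    ("cano", "cano", "VERB"),
    ("Troiae", "Troia", "PROPN"),
]

report = classify(tokens, LEMMA_BANK)
matched = len(report["unique"]) + len(report["ambiguous"])
overlap = matched / len(tokens)  # share of tokens found in the lemma bank
turtle = to_turtle(report)
print(f"overlap: {overlap:.2f}")
print(turtle)
```

In this sketch, only unique matches are serialised; ambiguous and missing tokens are left in the report so that they can be disambiguated, corrected or added to the lemma bank by hand, mirroring the intermediate steps of the workflow.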
On Twitter, look for the hashtag #ldk2021 and/or the handle @ERC_LiLa to follow updates from this tutorial.