LDK 2021 tutorial
Language, Data & Knowledge (LDK)
This tutorial will be held during the third edition of the Language, Data & Knowledge conference (LDK 2021). COVID-permitting, the conference will take place from 1 to 3 September 2021 in Zaragoza, Spain.
Date and time of the activity
September 1, 2021: 9:00-13:00 (CEST).
Location
Zaragoza, Spain & online.
Registration
**REGISTRATION IS NOW CLOSED**
Description
This tutorial falls within the area of Linguistic Linked Open Data (LLOD). By applying Linked Data and FAIR principles, the LiLa: Linking Latin project makes linguistic resources for Latin (e.g. textual corpora, lexica, dictionaries) interact on the web through a shared lexical basis: a collection of lemmas known as the LiLa lemma bank.
Objective
In this hands-on tutorial, participants will learn how to link a Latin text to the LiLa Knowledge Base of linguistic resources. By the end of the tutorial participants will have a better understanding of 1) the benefits of linking a Latin text to the LiLa Knowledge Base, and 2) the work required to help machines process linguistic data and produce quality resources.
Target audience
This activity is intended for those who wish to publish Latin texts on the web (e.g. computational linguists, theoretical linguists, classicists, philologists) and to connect them to the wealth of interoperable linguistic resources already linked to LiLa. No prior hands-on experience with Natural Language Processing or Linked Data technologies is required, but participants are expected to have a basic understanding of lemmatisation, Part-of-Speech (PoS) tagging and Linked Data. While knowledge of Latin is preferable, participants who don’t know Latin but are nevertheless interested in the project and its methods are also welcome to join.
Materials and technical requirements
The text and tools necessary to participate in the event will be provided by the LiLa team before and during the tutorial. Check the following GitHub repository for updates: https://github.com/CIRCSE/Tutorials/tree/main/LDK21.
The tutorial is designed to work with desktop computers and laptops, *not* tablets or smartphones.
Programme
The activity will consist of two parts.
Part 1. Theory: Presentation of the structure of the LiLa Knowledge Base.
Part 2. Practice: Preparing and linking a Latin text to LiLa. Workflow outline:
- Automatic assessment of the lexical overlap between the Latin text and the LiLa lemma bank
- Disambiguation of ambiguous matches
- Correction of PoS-tagging/lemmatisation errors
- Conversion of the lemmatised text to RDF Turtle syntax for inclusion in the LiLa Knowledge Base
- Querying the newly added text and the entire Knowledge Base with the RDF query language SPARQL
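To give a feel for the first, second and fourth workflow steps, here is a minimal Python sketch. It is an illustration under stated assumptions, not the LiLa team's tooling: the miniature lemma bank, the lemma identifiers, the `ex:` namespace and the `ex:form` / `ex:hasLemma` properties are all hypothetical placeholders, and the real URIs and vocabulary are those supplied with the tutorial materials.

```python
# Hypothetical miniature lemma bank: (lemma, PoS) -> made-up lemma identifiers.
# The real LiLa lemma bank and its URIs are NOT reproduced here.
LEMMA_BANK = {
    ("arma", "NOUN"): ["lemma_1"],
    ("vir", "NOUN"): ["lemma_2"],
    ("cano", "VERB"): ["lemma_3"],
    # Homographs sharing a PoS ("people" vs "poplar") give an ambiguous match.
    ("populus", "NOUN"): ["lemma_4", "lemma_5"],
}

# A lemmatised, PoS-tagged text as (form, lemma, PoS) triples.
TOKENS = [
    ("arma", "arma", "NOUN"),
    ("virum", "vir", "NOUN"),
    ("cano", "cano", "VERB"),
    ("populus", "populus", "NOUN"),
    ("Troiae", "Troia", "PROPN"),
]


def match_tokens(tokens, bank):
    """Sort tokens into unique, ambiguous and missing lemma-bank matches."""
    report = {"unique": [], "ambiguous": [], "missing": []}
    for position, (form, lemma, pos) in enumerate(tokens, start=1):
        candidates = bank.get((lemma, pos), [])
        if len(candidates) == 1:
            report["unique"].append((position, form, candidates[0]))
        elif candidates:
            # More than one candidate: needs manual disambiguation.
            report["ambiguous"].append((position, form, candidates))
        else:
            # No candidate: possibly a tagging/lemmatisation error or a gap.
            report["missing"].append((position, form, lemma))
    return report


def to_turtle(unique_matches, base="http://example.org/"):
    """Serialise unambiguous matches as Turtle (placeholder vocabulary).

    Assumes forms contain no quotes or backslashes, so no escaping is done.
    """
    lines = [f"@prefix ex: <{base}> .", ""]
    for position, form, lemma_id in unique_matches:
        lines.append(
            f'ex:token{position} ex:form "{form}" ; ex:hasLemma ex:{lemma_id} .'
        )
    return "\n".join(lines)


if __name__ == "__main__":
    report = match_tokens(TOKENS, LEMMA_BANK)
    coverage = len(report["unique"]) / len(TOKENS)
    print(f"Unambiguously linked tokens: {coverage:.0%}")
    print(to_turtle(report["unique"]))
```

In this sketch, tokens that land in the `ambiguous` and `missing` lists are the ones that would go on to the disambiguation and error-correction steps before any RDF is produced.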
On Twitter, look for the hashtag #ldk2021 and/or the handle @ERC_LiLa to follow updates from this tutorial.