18 III: Marco Passarotti, Francesco Mambrini (ERC-CoG project LiLa: Linking Latin / Università Cattolica del Sacro Cuore, Milan) Interlinking through Lemmas. The LiLa Knowledge Base of Interlinked Linguistic Resources for Latin
Abstract
The talk presents the LiLa Knowledge Base (https://lila-erc.eu/), a collection of diverse linguistic resources for Latin. Resources in LiLa are described with a shared vocabulary for knowledge description and interlinked according to the principles of the Linked Data paradigm.
Reflecting its strongly lexical design, the core of the LiLa Knowledge Base consists of a large collection of Latin lemmas. These canonical forms serve as the backbone for interoperability between the resources, linking all the entries in lexical resources and all the tokens in corpora that point to the same lemma. After detailing the architecture supporting LiLa, the talk:
a) describes the LiLa collection of lemmas, particularly focussing on how the Knowledge Base approaches the challenges raised by harmonizing different strategies of lemmatization that can be found in linguistic resources for Latin;
b) details the modeling and linking of a number of textual and lexical resources for Latin, including a dependency treebank, an etymological dictionary and a polarity lexicon;
c) presents some SPARQL queries that extract information from the interoperable resources currently linked to LiLa, and shows the prototype of a tool to automatically link a raw Latin text to the Knowledge Base.
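
To make the idea of interlinking through lemmas concrete, here is a minimal sketch in Python. It is not the LiLa data model, which is expressed as Linked Data: the class LemmaHub, the identifier "lemma/1", the resource names and the item identifiers are all invented for illustration. The sketch assumes that one lemma can be reached through several written forms (here the spelling variants "voluptas" and "uoluptas"), so that resources following different lemmatization conventions still meet at the same lemma.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LemmaHub:
    """Toy stand-in for a lemma collection (hypothetical, not LiLa's model)."""

    # Maps each written form, including variants, to a lemma identifier.
    form_to_lemma: dict = field(default_factory=dict)
    # Maps a lemma identifier to the (resource, item) pairs linked to it.
    links: dict = field(default_factory=lambda: defaultdict(list))

    def add_lemma(self, lemma_id, *written_forms):
        """Register a lemma under every written form it may appear with."""
        for form in written_forms:
            self.form_to_lemma[form.lower()] = lemma_id

    def link(self, resource, item, citation_form):
        """Attach a resource item to the lemma its citation form points to."""
        lemma_id = self.form_to_lemma.get(citation_form.lower())
        if lemma_id is None:
            return None  # unknown form: left unlinked rather than guessed
        self.links[lemma_id].append((resource, item))
        return lemma_id

    def resources_for(self, form):
        """Return every linked item that shares the lemma of the given form."""
        lemma_id = self.form_to_lemma.get(form.lower())
        return sorted(self.links.get(lemma_id, []))


# Invented example data: three resources, two spelling conventions, one lemma.
hub = LemmaHub()
hub.add_lemma("lemma/1", "voluptas", "uoluptas")
hub.link("treebank", "token-17", "uoluptas")
hub.link("etymological-dictionary", "entry-204", "voluptas")
hub.link("polarity-lexicon", "entry-9", "Voluptas")

linked = hub.resources_for("voluptas")
print(linked)
```

Looking up either spelling returns the same three items, which is the sense in which the lemma acts as the meeting point between a corpus token and entries in lexical resources. A form that is not in the collection is left unlinked instead of being matched by guesswork.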