Semantic Web, Linked Data, and Vocabularies

The following summarizes my notes on the domain. The notes continue my research among others presented in:
Yet the recommendation put forward in a recent review of the domain might spur interest in it:
    Answers to RQ2: Solutions and requirements
Based on the analyses of the nineteen papers, we identified six essential solutions and their combination for the successful publication of LD in the NMCAs listed: vocabularies, ontologies, metadata, and solutions using standardised technologies that meet requirements such as readability and serialisation. In the LD development projects, it is essential to provide systematic support to creators and users developing vocabularies and ontologies and the scalability of approaches to deal with large datasets. (ESt emphasis)

1. The basics / Introductory courses

EuroSDR/PLDN

The EduServ course: Spatial Linked (Open) Data (2020) had the objective that after completing the course, the participants will be able to: The further course description spans the RDF, SPARQL, and endpoint concepts. Implicitly, triplestores are applied as the basis for the endpoint queries. While the course 'used PLDN as a sandbox', no dataset or course material seems to be available at PLDN today.

The Metadata Matters symposium on 6 September 2023 offers two courses: Linked Data Introduction Course (by Triply) and SPARQL & GeoSPARQL Course (by Triply). Course participants have to be physically present.

LandPortal

The Land Portal Foundation has adopted a Linked Open Data Strategy and presents details of the technology it applies, as well as how to make best use of it. The web pages include a number of links to YouTube videos, including e.g. a 10 min presentation by Tim Berners-Lee (2010). One section covers the Controlled Vocabularies and Taxonomies developed by the Land Portal, which include LandVoc - the Linked Land Governance Thesaurus, where a LandVoc hierarchy offers a nice graphical presentation.
From a learning point of view, the next section on Triple Store, SPARQL endpoint & Graphs might be more important. Besides offering introductory information, a SPARQL endpoint (https://landportal.org/sparql) is available to query different graphs, drawing on a Virtuoso Open-Source triplestore, version 7.2.
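As a minimal sketch of how such an endpoint can be queried, the following uses only the Python standard library and the plain SPARQL 1.1 Protocol (a GET request with a 'query' parameter). The endpoint URL is the one given above; the query text and the helper names build_query_url and run_select are illustrative assumptions, and it has not been verified here that the endpoint is currently reachable.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://landportal.org/sparql"  # endpoint URL as given in the text

# Illustrative query (an assumption): list a few named graphs in the store.
LIST_GRAPHS = """
SELECT DISTINCT ?g
WHERE { GRAPH ?g { ?s ?p ?o } }
LIMIT 20
"""


def build_query_url(endpoint, query):
    """Encode a SPARQL query as a GET request URL (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query.strip()})


def run_select(endpoint, query, timeout=30.0):
    """Send a SELECT query and return the result bindings as a list of dicts."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        # Ask for the standard JSON serialisation of SPARQL results.
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling run_select(ENDPOINT, LIST_GRAPHS) would then return one dictionary per graph found, each with a "g" entry holding the graph URI. The same two functions should work against other SPARQL endpoints, since nothing in them is specific to Virtuoso.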
A Linked Open Data Glossary is available, supporting the introduction to the Linked Data domain.

2. Semantic Platforms, incl. VocBench

Basel and Basic BARTOC

The Basel Register of Thesauri, Ontologies & Classifications (BARTOC) was established in 2013 by Dr. Andreas Ledl, Basel University Library. BARTOC builds on the library and information science tradition of presenting controlled and structured vocabularies, which also framed the founding of the International Society for Knowledge Organization (ISKO). Across the individual types of knowledge organization systems (KOS), BARTOC contains among others 700 ontologies (24.93%), 680 thesauri [including CaLAThe] (24.22%), and 243 glossaries (8.65%). In contrast to Linked Open Vocabularies (LOV), which is to some extent a similar initiative focussing on ontologies, BARTOC covers the whole range of controlled vocabularies. (Waeber, Ledl, 2019)
Initially, through a public-private partnership with the Semantic Web Company (Vienna), BARTOC had access to ‘PoolParty Thesaurus & Taxonomy Management Software’, a world-class tool to build and maintain information architectures, which provides users with a browsable visualization of vocabularies available in SKOS format. The SKOS files of the vocabularies were uploaded to a Virtuoso RDF triple store, hosted by the Semantic Web Company. However, BARTOC then adopted an open source policy and applied ‘Skosmos’, a web-based SKOS browser and publishing tool developed by the National Library of Finland. The RDF triple store also had to be changed, to Apache Jena Fuseki. The authors state that it is very easy to set up and maintain the Skosmos platform. 'It only took about an hour to arrange everything and have a first working instance running.' They mention that vocabularies can also be published with open source software like TemaTres (http://www.vocabularyserver.com/) or VocBench (http://vocbench.uniroma2.it/). (Waeber, Ledl, 2019)
By 2020 the database had moved to the Verbundzentrale des GBV (VZG), Germany, was renamed the Basic Register of Thesauri, Ontologies & Classifications, and was ported to a new technical infrastructure (BARTOC). While the Basel representation was SKOS-based, the new facility is based on the JSKOS data format for Knowledge Organization Systems. JSKOS combines the benefit of RDF for data aggregation and JSON for easy access and storage by defining a set of object types such as concepts, concept schemes, mappings, concordances and registries, and fields for description of these objects. The JSKOS object types and fields extend RDF classes and properties found in SKOS to support the most used information found in KOS descriptions. (Voss et al, 2016).
BARTOC is based on a technical infrastructure developed as part of the coli-conc services: the BARTOC web interface, developed in JavaScript with the Vue framework; the JSKOS data format for Knowledge Organization Systems; and a JSKOS Server as backend database. All parts are published as Open Source (BARTOC). Finally, a list of software for controlled vocabularies is offered, which includes SKOS Play, a free application to render and visualise thesauri, taxonomies or controlled vocabularies expressed in SKOS.
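To illustrate how a JSKOS object relates to SKOS, the sketch below holds one concept as a plain Python dictionary and unfolds it into SKOS-style triples. The field names (uri, type, prefLabel, inScheme, broader, narrower) follow my reading of the JSKOS specification and should be checked against it; the example.org URIs, the labels, and the function name jskos_concept_to_triples are hypothetical.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# A hypothetical concept in JSKOS-like form (URIs and labels are made up).
concept = {
    "uri": "http://example.org/voc/parcel",
    "type": [SKOS + "Concept"],
    "prefLabel": {"en": "parcel", "de": "Flurstück"},
    "inScheme": [{"uri": "http://example.org/voc"}],
    "broader": [{"uri": "http://example.org/voc/land-unit"}],
}


def jskos_concept_to_triples(item):
    """Unfold a JSKOS-like concept into (subject, predicate, object) tuples.

    Labels become (text, language) pairs; links become plain URI strings.
    """
    subject = item["uri"]
    triples = []
    for type_uri in item.get("type", []):
        triples.append((subject, RDF_TYPE, type_uri))
    for lang, label in item.get("prefLabel", {}).items():
        triples.append((subject, SKOS + "prefLabel", (label, lang)))
    for field in ("inScheme", "broader", "narrower"):
        for target in item.get(field, []):
            if target:  # skip empty placeholders in a list
                triples.append((subject, SKOS + field, target["uri"]))
    return triples
```

The point of the design is visible in the data: the dictionary can be stored and exchanged as ordinary JSON, while each field still maps onto a SKOS property when the data are aggregated as RDF.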

While the Basel version was established through dialogue between the curators and the thesaurus editors, initiated by the curators, the curators of the Basic version expected thesaurus editors to handle the transformation process.

VocBench

VocBench is a web-based, multilingual, collaborative development platform for managing OWL ontologies, SKOS(/XL) thesauri, Ontolex-lemon lexicons and generic RDF datasets. VocBench is powered by the Semantic Turkey Knowledge Acquisition and Management framework. It is available in the PLDN setting at http://vocbench.pldn.nl/vocbench3/#/Home
The Semantic Turkey platform is based on RDF4J as its native, stable solution for RDF storage, which means that no separate triple store is needed, unlike for Skosmos.

3. Geospatial vocabularies beside CaLAThe

The Location Innovation Academy has, within the GeoE3 project, developed a Glossary in addition to courses (access requires an account). To support the learning process, and in line with the CaLAThe presentations, a hierarchy of geospatial terms has been drafted, available here.

Different users and purposes motivate a certain overlap, but it would be helpful to have a shared description of the available resources, pointing to the relative qualities of these.

4. GeoSPARQL

In Milos Jovanovik, Timo Homburg and Mirko Spasic: A GeoSPARQL Compliance Benchmark (2021), you read:

Recently, the EuroSDR group reused the benchmark implementation of [14] to implement a small GeoSPARQL compliance benchmark (EuroSDR GeoSPARQL Test: https://data.pldn.nl/eurosdr/geosparql-test (accessed on 22 May 2021)). This compliance benchmark consists of 27 queries testing a selection of GeoSPARQL functions on a test dataset. In contrast to our benchmark, this implementation does not explicitly test all requirements defined in the GeoSPARQL standard. In particular, GML support, RDFS entailment support and the query rewrite extension, among others, have not been tested in this benchmark.

For the land administration and cadastral domain, GML support is essential.
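As a rough sketch of what such a test involves, the function below builds a GeoSPARQL query that selects features lying within a polygon and lets the caller choose whether the geometry is read through geo:asWKT or geo:asGML. The prefixes, property names and the geof:sfWithin function are taken from the GeoSPARQL standard; the function name within_query and the polygon coordinates are hypothetical, and whether a given triplestore accepts the GML variant is exactly what a compliance benchmark would have to establish.

```python
PREFIXES = (
    "PREFIX geo: <http://www.opengis.net/ont/geosparql#>\n"
    "PREFIX geof: <http://www.opengis.net/def/function/geosparql/>\n"
)


def within_query(wkt_polygon, serialization="asWKT", limit=10):
    """Build a query for features whose geometry lies within a WKT polygon.

    serialization selects the geometry property: 'asWKT' or 'asGML'.
    """
    if serialization not in ("asWKT", "asGML"):
        raise ValueError("serialization must be 'asWKT' or 'asGML'")
    return PREFIXES + (
        "SELECT ?feature ?lit WHERE {\n"
        "  ?feature geo:hasGeometry ?geom .\n"
        f"  ?geom geo:{serialization} ?lit .\n"
        f'  FILTER(geof:sfWithin(?lit, "{wkt_polygon}"^^geo:wktLiteral))\n'
        "}\n"
        f"LIMIT {limit}"
    )


# Hypothetical search area (longitude latitude pairs, closed ring).
AREA = "POLYGON((9.8 57.0, 10.0 57.0, 10.0 57.1, 9.8 57.1, 9.8 57.0))"
gml_query = within_query(AREA, serialization="asGML")
```

Running the WKT and the GML variant against the same dataset and comparing the results gives a first impression of whether GML literals are supported by an endpoint.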

5. Cadastral casework



Erik Stubkjær, 2023-08-19, -17, -11, est@plan.aau.dk