Semantic Web, Linked Data, and Vocabularies
The following summarizes my notes on the domain. The notes continue my
research, presented among others in:
Yet the recommendation of a recent review of the domain might spur
interest in it:
Answers to RQ2: Solutions and requirements
Based on the analyses of the nineteen papers, we identified six
essential solutions and their combination for the successful
publication of LD in the NMCAs listed: vocabularies, ontologies,
metadata, and solutions using standardised technologies that meet
requirements such as readability and serialisation. In the LD
development projects, it is essential to provide systematic support to
creators and users developing vocabularies and ontologies and the
scalability of approaches to deal with large datasets. (ESt emphasis)
1. The basics / Introductory courses
EuroSDR/PLDN
The EduServ course: Spatial Linked (Open) Data (2020) had the objective
that after completing the course, the participants will be able to:
- understand the basic concepts of Linked Data
- explain how Linked Data can contribute to interoperability in the
  spatial domain
- explain the role of ontologies in conceptual modelling
- gain insight into the business case of implementing Linked Data (at
  Kadaster)
- perform a basic SPARQL query.
The rest of the course description covers the RDF, SPARQL, and endpoint
concepts. Implicitly, one or more triplestores serve as the basis for
the endpoint queries. While the course 'used PLDN as a sandbox', no
dataset or course material seems to be available at PLDN today.
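As a reminder of what the RDF and SPARQL concepts amount to, the following is a minimal sketch in plain Python, not the course material. It assumes a tiny invented dataset (all ex: names are hypothetical) and mimics how a single SPARQL triple pattern such as `?s rdf:type ex:Parcel` is matched against stored triples. A real triplestore does far more (joins, indexes, inference).

```python
from typing import Iterator

Triple = tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical example data; the ex: names are invented for illustration.
TRIPLES: list[Triple] = [
    ("ex:parcel1", "rdf:type", "ex:Parcel"),
    ("ex:parcel1", "ex:locatedIn", "ex:Apeldoorn"),
    ("ex:parcel2", "rdf:type", "ex:Parcel"),
    ("ex:parcel2", "ex:locatedIn", "ex:Utrecht"),
    ("ex:Apeldoorn", "rdf:type", "ex:Municipality"),
]


def match(pattern: Triple, triples: list[Triple]) -> Iterator[dict[str, str]]:
    """Yield variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    for triple in triples:
        binding: dict[str, str] = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable used twice in the pattern must bind to the same value.
                if binding.get(term, value) != value:
                    break
                binding[term] = value
            elif term != value:
                break
        else:
            yield binding


# Roughly: SELECT ?s WHERE { ?s rdf:type ex:Parcel }
parcels = [b["?s"] for b in match(("?s", "rdf:type", "ex:Parcel"), TRIPLES)]
print(parcels)
```

The point of the sketch is only that a query is a triple with holes in it, and the answer is the list of values that fill the holes.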
The Metadata Matters symposium on 6 September 2023 offers two courses:
Linked Data Introduction Course (by Triply) and SPARQL & GeoSPARQL
Course (by Triply). Course participants have to be physically present.
LandPortal
The Land Portal Foundation has adopted a Linked Open Data Strategy and
presents details regarding the technology it applies, as well as how to
make best use of it. The web pages include a number of links to YouTube
videos, e.g. a 10 min presentation by Tim Berners-Lee (2010). One
section covers the Controlled Vocabularies and Taxonomies developed by
the Land Portal, which include LandVoc, the Linked Land Governance
Thesaurus; a LandVoc hierarchy offers a nice graphical presentation.
From a learning point of view, the next section on Triple Store, SPARQL
endpoint & Graphs might be more important. Besides offering
introductory information, it provides a SPARQL endpoint
(https://landportal.org/sparql) for querying different graphs, drawing
on a Virtuoso Open-Source triplestore, version 7.2.
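To make the endpoint concrete, here is a hedged sketch of how such an endpoint could be queried from Python using only the standard library. It assumes the endpoint follows the SPARQL 1.1 Protocol (query passed as a `query` parameter, JSON results requested via the Accept header); whether the Land Portal endpoint is currently reachable and behaves this way has not been verified here. The helper names `build_request` and `run_query` are my own.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://landportal.org/sparql"  # endpoint named in the text above

# Standard SPARQL: list up to ten named graphs that hold at least one triple.
LIST_GRAPHS = """
SELECT DISTINCT ?g
WHERE { GRAPH ?g { ?s ?p ?o } }
LIMIT 10
"""


def build_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a GET request per the SPARQL 1.1 Protocol, asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query.strip()})
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(query: str, endpoint: str = ENDPOINT) -> list[dict]:
    """Send the query and return the result bindings (requires network access)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


# Building the request does not touch the network; run_query() would.
request = build_request(LIST_GRAPHS)
print(request.full_url)
```

Calling `run_query(LIST_GRAPHS)` would return one binding per graph, each with a `g` entry, provided the endpoint answers as assumed.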
A Linked Open Data Glossary is available, supporting
the introduction to the Linked Data domain.
2. Semantic Platforms, incl. VocBench
Basel and Basic BARTOC
The Basel Register of Thesauri, Ontologies & Classifications (BARTOC)
was established in 2013 by Dr. Andreas Ledl, Basel University Library.
BARTOC builds on the library and information science tradition of
presenting controlled and structured vocabularies, which also framed
the founding of the International Society for Knowledge Organization
(ISKO).
Across the individual types of knowledge organization systems (KOS),
BARTOC contains among others 700 ontologies (24.93%), 680 thesauri
[including CaLAThe] (24.22%), and 243 glossaries (8.65%). In contrast
to Linked Open Vocabularies (LOV),
which is to some extent a similar initiative focussing on ontologies,
BARTOC covers the whole range of controlled vocabularies. (Waeber,
Ledl, 2019)
Initially, BARTOC, through a public-private partnership with the
Semantic Web Company (Vienna), had access to ‘PoolParty Thesaurus &
Taxonomy Management Software’, a world-class tool to build and maintain
information architectures, which provides browsable visualization of
vocabularies available in SKOS format for users. The SKOS files of
vocabularies were uploaded to a Virtuoso RDF triple store, hosted by
Semantic Web Company. However, BARTOC adopted an open source policy and
applied ‘Skosmos’, a web-based SKOS browser and publishing tool
developed by the Finnish National Library. Also, the RDF triple store
had to be changed to Apache Jena Fuseki. The authors state that it is
very easy to set up and maintain the Skosmos platform. 'It only took
about an hour to arrange everything and have a first working instance
running.' Mention is made that vocabularies can also be published with
open source software like TemaTres (http://www.vocabularyserver.com/)
or VocBench (http://vocbench.uniroma2.it/). (Waeber, Ledl, 2019)
By 2020 the database had moved to the Verbundzentrale des GBV (VZG),
Germany, had been renamed the Basic Register of Thesauri, Ontologies &
Classifications, and had been ported to a new technical infrastructure
(BARTOC).
While the Basel representation was SKOS-based, the new facility was
based on the JSKOS data format for Knowledge Organization Systems.
JSKOS combines the benefit of RDF for data aggregation and JSON for
easy access and storage by defining a set of object types such as
concepts, concept schemes, mappings, concordances and registries, and
fields for description of these objects. The JSKOS object types and
fields extend RDF classes and properties found in SKOS to support the
most used information found in KOS descriptions (Voss et al., 2016).
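The relation between JSKOS and SKOS can be illustrated with a small sketch. It assumes the JSKOS field names `uri`, `type`, `prefLabel`, `inScheme` and `broader` as I read them in the JSKOS specification; the example.org URIs and the labels are invented, and the function `to_skos_triples` is a hypothetical helper, not part of any coli-conc software.

```python
import json

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical JSKOS concept; the URIs and labels are invented for illustration.
concept = {
    "uri": "http://example.org/voc/parcel",
    "type": [SKOS + "Concept"],
    "prefLabel": {"en": "parcel", "nl": "perceel"},
    "inScheme": [{"uri": "http://example.org/voc"}],
    "broader": [{"uri": "http://example.org/voc/land-unit"}],
}


def to_skos_triples(obj: dict) -> list[tuple[str, str, str]]:
    """Translate a few JSKOS fields of one object into SKOS-style triples."""
    subject = obj["uri"]
    triples = [(subject, RDF_TYPE, type_uri) for type_uri in obj.get("type", [])]
    # prefLabel is a language map: one label per language tag.
    for lang, label in obj.get("prefLabel", {}).items():
        triples.append((subject, SKOS + "prefLabel", f'"{label}"@{lang}'))
    # inScheme and broader hold lists of objects identified by their URI.
    for field in ("inScheme", "broader"):
        for target in obj.get(field, []):
            triples.append((subject, SKOS + field, target["uri"]))
    return triples


print(json.dumps(concept, indent=2))  # the JSON side: easy to store and access
for triple in to_skos_triples(concept):  # the RDF side: easy to aggregate
    print(triple)
```

The same record thus serves both purposes named above: it is ordinary JSON for storage and access, and each field maps onto an RDF property for aggregation.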
BARTOC is based on a technical infrastructure developed as part of
coli-conc services: a BARTOC web interface developed in JavaScript with
the Vue framework, the JSKOS data format for Knowledge Organization
Systems, and a JSKOS Server as backend database. All parts are
published as Open Source (BARTOC). Finally, a list of Software for
controlled vocabularies is offered, which includes SKOS Play, a free
application to render and visualise thesauri, taxonomies or controlled
vocabularies expressed in SKOS.
While the Basel version was established through dialogue between the
curators and the thesaurus editors, initiated by the curators, the
curators of the Basic version expected thesaurus editors to handle the
transformation process.
VocBench
VocBench is a web-based, multilingual, collaborative development
platform for managing OWL ontologies, SKOS(/XL) thesauri, OntoLex-Lemon
lexicons and generic RDF datasets. VocBench is powered by the Semantic
Turkey Knowledge Acquisition and Management framework. It is available
in the PLDN setting at http://vocbench.pldn.nl/vocbench3/#/Home.
The Semantic Turkey platform is based on RDF4J as the native, stable
solution for RDF, which means that no separate triple store is needed,
unlike for Skosmos.
3. Geospatial vocabularies beside CaLAThe
Within the GeoE3 project, the Location Innovation Academy has developed
a Glossary in addition to courses (access requires an account). To
support the learning process, and in line with CaLAThe presentations, a
hierarchy of geospatial terms has been drafted, available here.
Different users and purposes motivate a certain overlap, but it would
be helpful to have a shared description of the available resources,
pointing to their relative qualities.
- Kless, D., Milton, S.: Towards quality measures for evaluating
  thesauri. In: Sánchez-Alonso, S., Athanasiadis, I.N. (eds.) MTSR 2010.
  CCIS, vol. 108, pp. 312–319. Springer, Heidelberg (2010).
  doi:10.1007/978-3-642-16552-8_28
- Pinto, M.: A user view of the factors affecting quality of thesauri in
  social science databases. Libr. Inf. Sci. Res. 30(3), 216–221 (2008)
- Lacasta et al.: An automatic method for reporting the quality of
  thesauri. Data & Knowledge Engineering 104, 1–14 (2016).
  https://doi.org/10.1016/j.datak.2016.05.002
4. GeoSPARQL
In Milos Jovanovik, Timo Homburg and Mirko Spasic: A GeoSPARQL
Compliance Benchmark (2021), you read:
Recently, the EuroSDR group reused the benchmark implementation of [14]
to implement a small GeoSPARQL compliance benchmark (EuroSDR GeoSPARQL
Test: https://data.pldn.nl/eurosdr/geosparql-test (accessed on 22 May
2021)). This compliance benchmark consists of 27 queries testing a
selection of GeoSPARQL functions on a test dataset. In contrast to our
benchmark, this implementation does not explicitly test all
requirements defined in the GeoSPARQL standard. In particular, GML
support, RDFS entailment support and the query rewrite extension, among
others, have not been tested in this benchmark.
For the land administration and cadastral domain, GML support is
essential.
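For orientation, the kind of query such a benchmark exercises can be sketched as follows. This is an illustrative example, not one of the 27 EuroSDR queries. It uses the standard GeoSPARQL namespaces and the `geof:sfWithin` function with a WKT literal; the helper names `bbox_to_wkt` and `within_query` and the bounding box values are my own assumptions.

```python
def bbox_to_wkt(min_lon: float, min_lat: float, max_lon: float, max_lat: float) -> str:
    """Return a closed WKT polygon for a bounding box (longitude latitude order)."""
    ring = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first point to close the ring
    ]
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON(({coords}))"


def within_query(wkt: str, limit: int = 10) -> str:
    """Build a GeoSPARQL query for features whose geometry lies within the polygon."""
    return f"""PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

SELECT ?feature ?wkt
WHERE {{
  ?feature geo:hasGeometry ?geometry .
  ?geometry geo:asWKT ?wkt .
  FILTER(geof:sfWithin(?wkt, "{wkt}"^^geo:wktLiteral))
}}
LIMIT {limit}"""


# Hypothetical search area; the coordinates are chosen only for illustration.
area = bbox_to_wkt(5, 52, 6, 53)
print(within_query(area))
```

A GML-based variant would read the geometry through `geo:asGML` and pass a `geo:gmlLiteral` instead; whether a given triplestore accepts that is exactly what the full compliance benchmark checks.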
5. Cadastral casework
Erik Stubkjær, 2023-08-19, -17, -11, est@plan.aau.dk