Saint George on a Bike (SGoaB) aims to improve the quality and quantity of metadata associated with imagery from European cultural heritage, using Artificial Intelligence technology, with an emphasis on the collections held by Europeana. While the AI-focused activities in the project work hard on enriching cultural images, SGoaB's activity 4, "Adoption of DSI interoperability standards for Linked Open Data deployment", focuses on ensuring the interoperability of SGoaB's results after they are produced. In particular, it looks at the requirements of existing Digital Service Infrastructures of the European Commission.
A specific focus is given to the European Data Portal (EDP), whose mission is to collect the metadata of public sector information available on public data portals across European countries. These datasets can then be discovered more easily by various actors and the general public, increasing their potential for reuse and impact.
SGoaB's activities are expected to produce enrichments that sit alongside the metadata Europeana gathers from cultural heritage institutions and then publishes openly. As a starting point for this activity, we therefore address how to make the Europeana dataset available in the EDP, as well as how to make SGoaB-specific data available. Our effort is thus not only a step towards having SGoaB results represented in the EDP but also a step towards the general availability of Europeana metadata in the EDP.
A requirement for delivery to the EDP is a description of one’s datasets in a profile of the Data Catalog Vocabulary (DCAT), an open standard from the World Wide Web Consortium (W3C). DCAT is a data model for representing datasets together with their "distributions", such as a data dump or an Application Programming Interface (API). It also specifies how to express various metadata about these resources (who publishes them, when, under which license, etc.).
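To make the dataset/distribution structure concrete, here is a minimal sketch of a DCAT description serialised as JSON-LD, built with the Python standard library only. It is an illustration, not the SGoaB specification: every URI under example.org, the title, the date, and the choice of license are placeholder assumptions, and the function name `build_dataset_description` is hypothetical. Only the vocabulary terms (`dcat:Dataset`, `dcat:Distribution`, `dcat:distribution`, `dcat:downloadURL`, `dcat:accessURL` and the Dublin Core terms) come from the published standards.

```python
import json

# Namespace URIs of the W3C DCAT vocabulary and Dublin Core terms.
DCAT = "http://www.w3.org/ns/dcat#"
DCT = "http://purl.org/dc/terms/"


def build_dataset_description():
    """Return a JSON-LD dict for one dataset with two distributions.

    All identifiers and values below are placeholders (assumptions).
    """
    license_uri = "https://creativecommons.org/publicdomain/zero/1.0/"
    return {
        "@context": {"dcat": DCAT, "dct": DCT},
        "@id": "https://example.org/dataset/heritage-metadata",
        "@type": "dcat:Dataset",
        "dct:title": "Example cultural heritage metadata",
        # Who publishes the dataset, and when it was issued.
        "dct:publisher": {"@id": "https://example.org/org/publisher"},
        "dct:issued": "2021-01-01",
        "dcat:distribution": [
            {
                # Distribution 1: a downloadable data dump.
                "@id": "https://example.org/dataset/heritage-metadata/dump",
                "@type": "dcat:Distribution",
                "dct:title": "Data dump",
                "dcat:downloadURL": {"@id": "https://example.org/dumps/metadata.zip"},
                "dct:license": {"@id": license_uri},
            },
            {
                # Distribution 2: access to the same data through an API.
                "@id": "https://example.org/dataset/heritage-metadata/api",
                "@type": "dcat:Distribution",
                "dct:title": "API",
                "dcat:accessURL": {"@id": "https://example.org/api"},
                "dct:license": {"@id": license_uri},
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_dataset_description(), indent=2))
```

The point of the sketch is the shape of the description: one dataset resource carries the shared metadata, and each way of obtaining the data is a separate distribution with its own access or download URL and license.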
As part of activity 4, SGoaB has checked the guidelines for would-be EDP metadata providers and developed a specification for how to describe Europeana’s datasets and their various distributions using DCAT. This specification has also passed the automated validation process made available by the EDP. The metadata is therefore ready to be submitted to the EDP, where it will then be assessed using the EDP's Metadata Quality Assurance dashboard. This stage consists of evaluating (and reporting on) the quality of submitted datasets against quality dimensions such as Findability, Accessibility, Interoperability, and Reusability (FAIR), which have long been an important target for Europeana, as explained in this post.
While these efforts are in progress, you can read the current iteration of the specification in the recently published milestone report.