The EU Knowledge Graph

Dennis Diefenbach (The QA Company)
Max De Wilde (DG CNECT, European Commission)
Anne Thollard (DG REGIO, European Commission)
@Semantics 2021 (hybrid)
07.09.2021

Outline

  1. Infrastructure: Wikibase
  2. The EU Knowledge Graph
  3. Kohesio
  4. Conclusion

What is Wikibase?

Wikimedia hosts many wikis...
Wikidata is one of them!
Wikibase is the software behind Wikidata

The EU Knowledge Graph

A data repository to store structured data about the European Union
Available at https://knowledgegraph.eu/

Why use Wikibase?

User-friendly
Graph structure
Can be queried
Can be edited by humans and by bots
https://linkedopendata.eu/wiki/Item:Q1
Scales well
Wikibase powers Wikidata, one of the largest existing KGs, which contains 5 billion triples
Multilingual
Full track of changes!

Current content...

European institutions
European countries
Capital cities
Heads of State
Directorates-General (DGs)
Buildings and canteens
960,000 projects co-funded by the European Union
263,000 beneficiaries of EU funds
Linked Data Solutions

Importing data...

1. Take any structured data

2. Model the data

  • We need concepts like building, office...
  • We need properties like address, opening hours, occupant...
  • Whenever possible, reuse existing material (Wikidata...)

3. Keep identifiers

Use external identifiers in order to establish a link with other resources

4. Import using Wikibase APIs

We always use Pywikibot
But there are alternatives...
The imported data is understandable, aligned with existing concepts, queryable and easy to reuse
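The modelling and identifier steps above can be sketched as a small payload builder. This is only an illustration, not the actual import code: it does not use Pywikibot, it only shows the JSON shape that the Wikibase `wbeditentity` API action accepts in its `data` parameter. The property IDs (`P1`, `P2`), the sample record and the placeholder identifier are hypothetical.

```python
import json

# Hypothetical property IDs: every Wikibase instance assigns its own.
P_ADDRESS = "P1"      # assumed "address" property
P_WIKIDATA_ID = "P2"  # assumed external identifier pointing to Wikidata


def string_claim(property_id, value):
    """Build one statement whose value is a plain string."""
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": property_id,
            "datavalue": {"value": value, "type": "string"},
        },
        "type": "statement",
        "rank": "normal",
    }


def build_entity_data(record):
    """Map a source record to the JSON structure expected by wbeditentity."""
    labels = {
        lang: {"language": lang, "value": text}
        for lang, text in record["labels"].items()
    }
    claims = [string_claim(P_ADDRESS, record["address"])]
    # Step 3: keep the external identifier so the item stays linked.
    if record.get("wikidata_id"):
        claims.append(string_claim(P_WIKIDATA_ID, record["wikidata_id"]))
    return {"labels": labels, "claims": claims}


# Made-up record standing in for one row of structured source data.
record = {
    "labels": {"en": "Example building", "fr": "Bâtiment exemple"},
    "address": "Rue Exemple 1, Brussels",
    "wikidata_id": "Q0000",  # placeholder, not a real item
}
payload = build_entity_data(record)
print(json.dumps(payload, indent=2, ensure_ascii=False))
```

A client library such as Pywikibot would then take care of authentication and of sending this structure to the API.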

Keeping the data fresh...

1. Entities imported from Wikidata

Wikidata
EU Knowledge Graph

WikidataUpdater

A bot that checks that the data is synchronised
Refreshed every 5 minutes!
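One possible way to check synchronisation, shown here as a minimal sketch and not as the actual WikidataUpdater logic: compare the revision ID stored at the last sync with the latest revision ID reported upstream (for example the `lastrevid` field returned by `wbgetentities`). The function name and the sample revision numbers are hypothetical.

```python
def entities_to_refresh(synced_revisions, latest_revisions):
    """Return the IDs whose upstream revision is newer than the synced one.

    Both arguments map an entity ID to an integer revision ID.
    Entities never synced before are treated as stale.
    """
    stale = []
    for entity_id, latest in latest_revisions.items():
        if synced_revisions.get(entity_id, -1) < latest:
            stale.append(entity_id)
    return sorted(stale)


# Made-up revision numbers for illustration only.
synced = {"Q1": 100, "Q2": 205, "Q3": 50}
latest = {"Q1": 100, "Q2": 210, "Q3": 50, "Q4": 7}

stale = entities_to_refresh(synced, latest)
print(stale)  # ['Q2', 'Q4']
```

A scheduler could run such a check every few minutes and re-import only the stale entities.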

Services provided...

1. Data exports

Available at https://data.linkedopendata.eu

2. Query Service

Available at https://query.linkedopendata.eu
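A query service of this kind can typically be called from a script over the standard SPARQL protocol. The sketch below uses only the Python standard library. The exact endpoint path is an assumption based on the usual Wikibase query service layout and should be checked against the service itself. The query is deliberately generic (any five items with an English label) so that it does not depend on specific property IDs.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: endpoint path follows the usual Wikibase/Blazegraph layout.
ENDPOINT = "https://query.linkedopendata.eu/bigdata/namespace/wdq/sparql"

QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?item ?label WHERE {
  ?item rdfs:label ?label .
  FILTER(LANG(?label) = "en")
}
LIMIT 5
"""


def build_request(endpoint, query):
    """Prepare a SPARQL GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": "eu-kg-example/0.1",  # hypothetical client name
    }
    return Request(url, headers=headers)


def run_query(endpoint, query):
    """Send the query and return the list of result bindings."""
    with urlopen(build_request(endpoint, query), timeout=30) as response:
        return json.load(response)["results"]["bindings"]


# Building the request needs no network access; run_query() does.
request = build_request(ENDPOINT, QUERY)
print(request.full_url)
```

Calling `run_query(ENDPOINT, QUERY)` would return one dictionary per result row, keyed by the variable names `item` and `label`.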

3. Question Answering

Users can query and explore the KG using natural language
Available at https://qa.linkedopendata.eu

Can be integrated in chatbots

Available at https://chatbot.cnect.eu/

Kohesio

Transparency on programmes and projects co-funded by the EU

EU Cohesion Policy

  • EU Cohesion Policy supports tens of thousands of projects across Europe every year
  • Cohesion Policy makes up approximately 32.5% of the 2014-2020 EU budget (about 350 billion euros)

What is Kohesio?

  • Cohesion funds are managed together with national and local authorities in the 27 EU member states
  • The member states have a legal obligation to publish the list of projects and beneficiaries on their national websites
  • The goal of Kohesio is to aggregate this data and make it publicly available in an easy, open way

Data

  1. Dozens of files in CSV, XLSX or XLS describing the projects of EU member states
  2. Around 15 files describing vocabulary specific to Cohesion Policy: categories of intervention, thematic objectives, etc.
  3. Data about geographic entities (NUTS)
  4. Wikidata
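Since each member state publishes its files in its own layout, a first step is to map the columns to one shared schema. The following is a minimal sketch of that idea, assuming CSV input. The column names, the target field names and the sample rows are all hypothetical and do not reflect the real files.

```python
import csv
import io

# Hypothetical source columns per country, mapped to one target schema.
COLUMN_MAPS = {
    "FR": {
        "Nom du projet": "label",
        "Bénéficiaire": "beneficiary",
        "Montant UE": "eu_contribution",
    },
    "IT": {
        "Titolo progetto": "label",
        "Beneficiario": "beneficiary",
        "Contributo UE": "eu_contribution",
    },
}


def harmonise(csv_text, country):
    """Read one national CSV and return rows in the shared schema."""
    mapping = COLUMN_MAPS[country]
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {target: raw[source].strip() for source, target in mapping.items()}
        row["eu_contribution"] = float(row["eu_contribution"])
        row["country"] = country
        rows.append(row)
    return rows


# Made-up sample data standing in for two national files.
sample_fr = "Nom du projet,Bénéficiaire,Montant UE\nProjet exemple,Commune exemple,1500.50\n"
sample_it = "Titolo progetto,Beneficiario,Contributo UE\nProgetto esempio,Comune esempio,2000\n"

projects = harmonise(sample_fr, "FR") + harmonise(sample_it, "IT")
for project in projects:
    print(project)
```

XLSX and XLS files would need a spreadsheet reader in place of the `csv` module, but the mapping step stays the same.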

Enriching the data

  1. Translating project labels and descriptions into English
  2. Computing geographic coordinates based on postal code (geocoding)
  3. Deducing in which NUTS region the project is located
  4. Linking NUTS regions and beneficiaries with Wikidata
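Once a project has coordinates, its region can be found with a point-in-polygon test. The sketch below uses the classic ray-casting algorithm on toy rectangles. It is an assumption about how step 3 could be done, not a description of the Kohesio pipeline. The region codes and coordinates are invented, and real NUTS boundaries would be loaded from an official geometry dataset.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray casting: count how many polygon edges a ray going east crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Toy regions as lists of (lon, lat) vertices; codes are hypothetical.
REGIONS = {
    "REGION_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "REGION_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}


def find_region(lon, lat, regions=REGIONS):
    """Return the code of the first region containing the point, or None."""
    for code, polygon in regions.items():
        if point_in_polygon(lon, lat, polygon):
            return code
    return None


print(find_region(1, 1))  # REGION_A
print(find_region(3, 1))  # REGION_B
print(find_region(5, 5))  # None
```

For production volumes, a geospatial library with a spatial index would be a more robust choice than a linear scan over all regions.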

Build a website for citizens on top

Available at https://kohesio.eu

To summarize

  1. We integrated structured data from different sources into one uniform model
  2. The data is enriched in various ways, thereby increasing its value
  3. The data is openly accessible to citizens, showing the impact of EU funds in their region

Conclusion

We have shown:
  1. how Wikibase is used as the underlying infrastructure for the EU Knowledge Graph
  2. the contents of the EU KG, how the data is ingested and kept fresh, and which services are offered
  3. a concrete use case: Kohesio
Acknowledgements:
  • DORIS Team @ DG CNECT
  • Knowledge Management Team @ DG REGIO
  • Wikimedia Deutschland (WMDE)
Thank you! Questions?