5.5 Proof-of-concept of the joint search interface for Nordic and Baltic legislative databases
Finland and Estonia are currently (2023) undertaking a proof-of-concept project to map out the data requirements of a common legislative search interface between countries, and what would be required of each country interested in participating in the interface and its development.
Establishing joint standards and principles for the development of legal databases in the Nordic and Baltic countries would significantly improve the availability and accessibility of information, as well as data interoperability in the future. The need for joint standards and principles is evident from the multitude of ongoing initiatives promoting cross-border data exchange at the EU level, as well as from the roadmaps published by existing search portals, such as N-Lex. The European Digital Strategy and the EU Data Act (see Data Act on page 11 for more information), as well as SDG OOTS (see SDG OOTS on page 15 for more information), promote the sharing of information between public administrations and citizens across the EU.
The benefits of utilising common standards include improvements in the quality, accessibility, and reliability of legislative data, in terms of both human and machine readability. Utilising common standards should therefore be considered a key goal in the future development of legal databases in all countries. The European Legislation Identifier (ELI) standard is the most prominent available standard for legal data. Among the Nordic and Baltic countries, only Finland, Norway, and Denmark have implemented ELI identifiers so far.
The proof-of-concept (PoC) taking place between Finland and Estonia in 2023 will provide valuable insight for establishing common standards and principles. In addition to providing information on the development requirements related to the standardisation of data sets, the PoC will also result in hands-on experience that other countries can utilise in their respective development efforts. The goal of the PoC phase is to identify the data requirements for a shared interface between countries and to create a demonstrator application for retrieving information from the Finnish and Estonian legal databases. The PoC is aimed at testing the practical functionality of the search interface. It complements text search with faceted search, which allows users to search and filter documents based on metadata such as enforcement dates or keywords. The faceted search also shows all the available selections with their hit counts. Aalto University’s Semantic Computing research group carried out the PoC from March 2023 to November 2023.
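The combination of text search, facet filtering, and hit counts described above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the PoC's implementation: the records, field names, and the functions `search` and `facet_counts` are all invented for this example.

```python
from collections import Counter

# Invented sample records; identifiers, titles, and values are illustrative only.
DOCS = [
    {"id": "fi-1", "country": "Finland", "year": 2018,
     "keywords": ["travel", "consumer protection"],
     "text": "Act on travel service combinations"},
    {"id": "fi-2", "country": "Finland", "year": 2020,
     "keywords": ["taxation"], "text": "Act on income tax"},
    {"id": "ee-1", "country": "Estonia", "year": 2018,
     "keywords": ["travel"], "text": "Tourism Act"},
    {"id": "ee-2", "country": "Estonia", "year": 2021,
     "keywords": ["taxation", "travel"],
     "text": "Act on travel expense taxation"},
]


def _values(doc, field):
    """Return the facet values of a field as a list (fields may be single- or multi-valued)."""
    value = doc[field]
    return value if isinstance(value, list) else [value]


def search(docs, text=None, **selected):
    """Keep documents that contain `text` and match every selected facet value."""
    hits = []
    for doc in docs:
        if text and text.lower() not in doc["text"].lower():
            continue
        if all(value in _values(doc, field) for field, value in selected.items()):
            hits.append(doc)
    return hits


def facet_counts(docs, field):
    """Count how many documents fall under each value of a facet."""
    counts = Counter()
    for doc in docs:
        counts.update(_values(doc, field))
    return dict(counts)


hits = search(DOCS, keywords="travel")
print([d["id"] for d in hits])        # documents tagged with the keyword
print(facet_counts(hits, "country"))  # hit counts per country within the result
```

Computing the counts over the current result set, rather than over the whole collection, is what lets a faceted interface show the user how many documents each further selection would leave.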
The PoC demonstrator is a development version that does not include all the features proposed in the pre-PoC, but those planning the full implementation should familiarise themselves with it. The demonstrator is intended to be used for collecting feedback from users at an early stage, in order to identify needs and requirements and to develop the product further. It is natural that some questions may remain open after this, or that new questions may arise.
The Finnish LawSampo service was used as the basis for the development of the demonstrator. The LawSampo data is retrieved from the Finnish Semantic Finlex service, which includes the texts and metadata of legislation in RDF format, applying the ELI standard. LawSampo uses a simplified data model derived from Semantic Finlex. Estonian statutes are published in XML format, but without using the ELI standard, so the Estonian data had to be converted to RDF. The conversion was done only to the extent that was essential for the PoC application. Because the data from Finland and Estonia was not published using the same standard, some work was required to understand the Estonian data format and to create an automatic conversion script.
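To give a rough idea of what such a conversion involves, the following Python sketch maps a small XML record to RDF triples in N-Triples syntax. It is an assumption-laden illustration, not the PoC's conversion script: the XML element names, the sample record, and the base URI are invented and do not reflect the actual Estonian schema, and the choice of ELI ontology terms (`LegalResource`, `title`, `date_document`) is a simplification.

```python
import xml.etree.ElementTree as ET

ELI = "http://data.europa.eu/eli/ontology#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
BASE = "http://example.org/ee/act/"  # hypothetical base URI for converted acts

# Invented sample input; the real source XML is structured differently.
SAMPLE_XML = """<act id="123">
  <title lang="et">Näidisseadus</title>
  <passed>2020-01-01</passed>
</act>"""


def escape(literal):
    """Escape characters that are not allowed unescaped in an N-Triples literal."""
    return (literal.replace("\\", "\\\\")
                   .replace('"', '\\"')
                   .replace("\n", "\\n"))


def act_to_ntriples(xml_text):
    """Convert one <act> element into a list of N-Triples lines."""
    root = ET.fromstring(xml_text)
    subject = f"<{BASE}{root.get('id')}>"
    triples = [f"{subject} <{RDF_TYPE}> <{ELI}LegalResource> ."]

    title = root.find("title")
    if title is not None and title.text:
        lang = title.get("lang", "und")
        triples.append(
            f'{subject} <{ELI}title> "{escape(title.text.strip())}"@{lang} .')

    passed = root.find("passed")
    if passed is not None and passed.text:
        triples.append(
            f'{subject} <{ELI}date_document> "{passed.text.strip()}"^^<{XSD_DATE}> .')
    return triples


for line in act_to_ntriples(SAMPLE_XML):
    print(line)
```

Even in this toy form, the mapping depends on knowing which source element corresponds to which property of the target model, and that knowledge has to be worked out again for each source format that does not follow a shared standard.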
Language technologies were used to enhance the search functionalities. The PoC uses keywords based on the European EuroVoc vocabulary, which are added to the documents using language models. Where an existing translation was not available, the PoC uses translations created with language models.
The PoC application shows how the legislation of two countries can be compared using the same faceted search interface. This requires that the data from both countries is converted to the same data model. Additionally, to use the search functionalities optimally, the vocabularies need to be the same. If all European legislation were published using shared standards, scaling up the PoC to all countries would be essentially trivial. When common standards are not implemented, however, the conversion work needs to be redone separately for each country.
The PoC makes it possible to search legislation using a keyword such as “travel”, for example, and the user will get the statutes in both Finnish and Estonian legislation that the language model has determined to relate to travelling. The user can also select a specific EU directive and get all the statutes relating to that directive from both Finland and Estonia. The PoC also includes simple tools for visualising the data.
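The dependence on a shared data model and a shared vocabulary can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the `Statute` model, the source field names, the two-entry vocabulary mapping, the sample records, and the directive identifier `dir:example-1` are invented for illustration and are not taken from the PoC.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statute:
    """Hypothetical common data model shared by both countries."""
    uri: str
    country: str
    title: str
    keywords: frozenset
    directives: frozenset


# Illustrative mapping from national subject terms to one shared concept label.
SHARED_VOCAB = {"matkailu": "tourism", "turism": "tourism"}


def to_shared(terms):
    """Map national terms to shared labels; unknown terms are kept in lower case."""
    return frozenset(SHARED_VOCAB.get(t.lower(), t.lower()) for t in terms)


def from_finnish(rec):
    """Adapter for an invented Finnish source record layout."""
    return Statute(rec["uri"], "Finland", rec["title"],
                   to_shared(rec["subjects"]), frozenset(rec["directives"]))


def from_estonian(rec):
    """Adapter for an invented Estonian source record layout."""
    return Statute("http://example.org/ee/act/" + rec["act_id"], "Estonia",
                   rec["heading"], to_shared(rec["terms"]),
                   frozenset(rec["eu_acts"]))


def by_directive(statutes, directive):
    """Group the titles of statutes linked to a directive by country."""
    result = {}
    for s in statutes:
        if directive in s.directives:
            result.setdefault(s.country, []).append(s.title)
    return result


STATUTES = [
    from_finnish({"uri": "http://example.org/fi/act/1",
                  "title": "Sample Finnish act",
                  "subjects": ["Matkailu"],
                  "directives": ["dir:example-1"]}),
    from_estonian({"act_id": "7", "heading": "Sample Estonian act",
                   "terms": ["turism"], "eu_acts": ["dir:example-1"]}),
]

print(by_directive(STATUTES, "dir:example-1"))
```

Once both sources are in the same model and use the same keyword labels, the search code needs no country-specific logic. The per-country adapters are the part that has to be written anew whenever a source does not follow a common standard.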