Information systems at Europol: Fishing the „data lake“ with a new dragnet

The EU police agency has completely restructured its information systems. German authorities are by far the heaviest users, both for storing and for querying data. A parliamentary question has now revealed what succeeded the Palantir software at Europol.

The European police agency in The Hague maintains various databases, the largest of which is the centralised „Europol Information System“ (EIS). There, police forces of the member states enter suspects, convicted persons or „potential“ future criminals if the offences in question fall within Europol’s remit. This covers serious or organised crime and terrorism.

For this purpose, personal data, national insurance numbers, telephone numbers, e-mail or IP addresses and evidence can be stored in the EIS, including searchable facial images, non-coding DNA data and fingerprints. The member states retain ownership of the data they transfer; national authorities can determine the purpose for which it is used and set restrictions.

Transmission with „dataloader“

The EIS is an index system and works on the hit/no hit principle. The parties involved can find out whether Europol, one of the EU member states or cooperation partners such as Interpol has a correlating data set and whether the person in question has already been investigated there. In case of a match, the authority from which the information originates receives an automatic notification.
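
The hit/no hit principle can be illustrated with a minimal Python sketch. The entry fields, the authority names and the function `hit_no_hit` are hypothetical and not taken from Europol’s actual system; the sketch only shows that a query reveals whether a match exists and which contributing authorities would be notified, not the underlying record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    identifier: str  # e.g. a normalised phone number (hypothetical key)
    owner: str       # authority that contributed the entry (hypothetical name)


def hit_no_hit(index, identifier):
    """Return (hit, owners_to_notify) without exposing the stored record."""
    owners = sorted({e.owner for e in index if e.identifier == identifier})
    return (len(owners) > 0, owners)


index = [
    IndexEntry("+31-000-0001", "Authority A"),
    IndexEntry("+49-000-0002", "Authority B"),
]

hit, notify = hit_no_hit(index, "+49-000-0002")
miss, nobody = hit_no_hit(index, "unknown")
```

In this sketch, the searching party only learns `hit` and the list of owners; any further exchange of details would have to take place between the authorities themselves.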

Data can be stored in the EIS manually by sending a data set directly to The Hague via Europol’s secure information channel SIENA. A „semi-automatic“ transmission is also possible, in which several files are uploaded at once.

Fourteen member states currently use a convenient „dataloader“. Entries in national police databases can be marked with a „transmission to Europol“ flag and are then transferred automatically. According to the agency, the vast majority of data in the EIS comes from such automatic transmission.

Third countries also use the EIS

The EIS currently holds about 1.5 million entries on persons, objects or events, about one third of which come from Germany. This is what the German Ministry of the Interior wrote last week in its answer to a parliamentary question by the Left Party. In 2021, the authorities carried out more than 12 million searches, compared to ten million in 2020. Last year, 76% of these queries came from Germany.
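
To put these proportions into absolute numbers, the following back-of-the-envelope calculation in Python uses only the rounded figures quoted above. The variable names are illustrative, and because the source gives approximations („about“, „more than“), the results are rough estimates rather than official statistics.

```python
# Rounded figures as quoted in the ministry's answer; results are estimates only.
eis_entries = 1_500_000      # "about 1.5 million" entries
searches_2021 = 12_000_000   # "more than 12 million", so a lower bound
searches_2020 = 10_000_000   # "ten million"

german_entries = eis_entries // 3             # "about one third"
german_searches = searches_2021 * 76 // 100   # 76% of all queries
growth_percent = (searches_2021 - searches_2020) * 100 // searches_2020

print(german_entries, german_searches, growth_percent)
# prints: 500000 9120000 20
```

On these assumptions, roughly half a million entries and at least nine million of the 2021 searches can be attributed to Germany, while the total number of searches grew by about a fifth within a year.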

Some member states send data to the EIS that already exists there. To avoid duplicate holdings, Europol runs every new entry through an automatic „Cross Border Crime Check“ (CBCC). In 2019, the system is said to have detected 2,736 duplicate entries.
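
How such an automatic check might catch duplicates can be sketched in a few lines of Python. This is not the CBCC itself, whose matching rules are not public in this article; the compared fields (`name`, `date_of_birth`), the normalisation and the function names are assumptions made purely for illustration.

```python
def comparison_key(record):
    """Build a normalised key; which fields are compared is an assumption."""
    return (record["name"].strip().casefold(), record["date_of_birth"])


def store_entry(store, record):
    """Store the record unless an equivalent one exists; return True if stored."""
    key = comparison_key(record)
    if key in store:
        return False  # duplicate detected, nothing is stored
    store[key] = record
    return True


store = {}
first = store_entry(store, {"name": "Jane Doe", "date_of_birth": "1980-01-01"})
second = store_entry(store, {"name": " jane doe ", "date_of_birth": "1980-01-01"})
```

Here the second submission differs only in spacing and capitalisation, so it is recognised as a duplicate and rejected. A real system would presumably need far more tolerant matching, for example for transliterated names.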

Third countries cannot enter data into the EIS, but according to the Europol Regulation they can transmit it to Europol and ask for a data cross-check. The United States of America, Canada and the Western Balkan countries, for example, have concluded cooperation agreements to this effect with Europol. Figures are available for 2019, when Europol exchanged messages with third countries in 176,000 cases, an increase of 11% over the previous year.

75 million records for „Analysis Projects“

Europol also wants to facilitate cross-border investigations with various „Analysis Projects“ (AP). Data can be stored in such a file if the offences concern at least two member states. APs exist on various phenomena, including Islamist and non-Islamist terrorism, „foreign fighters“, cyber and environmental crime and the sexual abuse of children.

The Europol Executive Director is responsible for determining their specific purpose. She also decides who has access, how the data in question is used and how long it is stored in an AP. For analysis purposes, Europol can systematically cross-check the data in APs with the Schengen Information System.

By the end of 2019, Europol’s APs contained more than 75 million records. Because this is supposed to be „high quality“ information, it is reportedly checked on a regular basis. Unlike the EIS, APs also store information on contacts and associates, and in some circumstances personal data of witnesses, victims, minors or even informants.

„Predictive analysis“

By searching for „cross matches“, the police agency wants to find links and networks among the persons, objects and offences in the APs and enable a „predictive analysis“. The agency calls this the „Europol Analysis System“ (EAS); it is run by investigators who are either seconded to The Hague as liaison officers by the member states or employed directly by Europol.

The service can also be provided through a „mobile office“, for example when Europol takes part in raids in an EU member state. The analysis teams are supported by translators, among others.

Europol also faces the challenge that the haystack of its information systems is constantly growing. The latest version of its regulation, adopted in February, gives the agency yet more powers to process and analyse large amounts of data, including from private companies or from telecommunications surveillance. Most of this data is unstructured, i.e. it has not been indexed, categorised or correlated.

Contract with Palantir

According to its annual report for 2019, Europol has an „automated data extraction tool“ for unstructured data that is now used in all crime areas. In the year under review, it was used to produce 20,000 „operational contributions containing unstructured data“. To tap into this, Europol now wants to acquire further „data extraction“ services, including for the „Internet of Things“ and crypto wallets.

To analyse such unstructured data, Europol procured the software „Gotham“ from the US company Palantir ten years ago and concluded a framework agreement with Capgemini from the Netherlands for this purpose. Palantir has been criticised for working closely with US intelligence agencies.

„Gotham“ can also convert unstructured data into structured data and visualise it to „identify new lines of investigation“. Supposedly, Europol only used the software to fight terrorism, but the contract with Capgemini was renewed several times and ended only last year.

„Data silos“ have been abolished

Following the revised regulation of 2016, Europol introduced an „Integrated Data Management Concept“ (IDMC) a year later. It is part of the „New Environment for Operations“ (NEO) programme, with which Europol has been completely redesigning its information architecture. This is intended to solve the problem that the same data on a person had to be entered separately into the EIS and the analysis projects (and sometimes the Schengen Information System). Instead of being kept in „silos“, each of which had specific access rights, crime-related information from the APs and the EIS now resides in a horizontal „data lake“. The right to access it is no longer granted according to the type of data, but according to the purpose of its processing.
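
The shift from access by data type to access by processing purpose can be made concrete with a small Python sketch. The class `LakeRecord`, the function `visible_for` and the purpose labels are hypothetical and chosen for illustration only; they do not describe Europol’s actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LakeRecord:
    content: str
    origin: str          # former silo, e.g. "EIS" or "AP"
    purposes: frozenset  # permitted processing purposes (labels are hypothetical)


def visible_for(records, purpose):
    """Return the records whose owner permits the given processing purpose."""
    return [r for r in records if purpose in r.purposes]


lake = [
    LakeRecord("entry 1", "EIS", frozenset({"cross-check"})),
    LakeRecord("entry 2", "AP", frozenset({"cross-check", "operational-analysis"})),
    LakeRecord("entry 3", "AP", frozenset({"operational-analysis"})),
]

cross_check = [r.content for r in visible_for(lake, "cross-check")]
analysis = [r.content for r in visible_for(lake, "operational-analysis")]
```

In this sketch the `origin` field plays no role in the access decision: a cross-check sees entries 1 and 2 regardless of which former silo they came from, while an operational analysis sees entries 2 and 3.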

With the introduction of a common repository as a „data lake“, Europol also terminated the contract for the use of „Gotham“. A successor was then to be installed, but Europol keeps details of this under wraps – just like the European Data Protection Supervisor, who blacked out a report on this at the crucial point.

The German Ministry of the Interior is a little more talkative in its answer to the parliamentary question. According to the answer, a „Data Analysis Portal“ has been running at Europol since the third quarter of 2021, which Europol claims to have developed with in-house resources and „without outsourcing“.

Query via QUEST

The horizontal concept of storage in the „data lake“ requires a uniform structure for the information fed into it. Under the direction of the German Federal Criminal Police Office (BKA), Europol has developed a universal format for this in the „UMF3“ programme. Automatic comparison via an interface was tested in a pilot project abbreviated QUEST with Finland, Greece, Hungary, Latvia, Poland and Romania.

At least nine member states now use the QUEST system in regular operation. At the BKA, it is accessed directly from the day-to-day case management system.

In addition to the EIS, the Analysis Projects are also to be connected to QUEST in the future, but Europol must first create the technical prerequisites for this. Where this is legally possible in a member state, a single QUEST search run could then query the relevant national databases alongside the Europol systems.
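
The idea of a single search run across several systems can be sketched as follows. The function `single_search`, the source names and the identifiers are hypothetical; the real QUEST interface is not documented in this article, so this only illustrates the principle of sending one query to several connected sources and collecting a hit flag from each.

```python
def single_search(sources, term):
    """Run one search term against every connected source and collect hit flags."""
    return {name: term in entries for name, entries in sources.items()}


# Hypothetical sources, each reduced to a set of searchable identifiers.
sources = {
    "EIS": {"ABC-123", "XYZ-789"},
    "national database": {"ABC-123"},
}

result = single_search(sources, "ABC-123")
partial = single_search(sources, "XYZ-789")
```

Connecting a further system, such as the Analysis Projects, would in this sketch simply mean adding another entry to `sources`, while the investigator still issues only one query.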

Research with German Trojan agency

Europol is increasingly relying on „artificial intelligence“ to develop novel solutions. Two years ago, the EU’s former Counter-Terrorism Coordinator proposed analysing the information stored in the „data lake“ with „artificial intelligence“ so that, for example, „radicalisation tendencies“ can be detected.

Within the framework of EU security research, the agency is involved in several projects to analyse „big data“. In AIDA, for example, the participants want to develop a „descriptive and predictive data analysis platform“, with a focus on cybercrime and terrorism. In GRACE, Europol is working with the German Trojan agency ZITiS to create a platform for processing leads on child sexual exploitation material. In STARLIGHT, Europol and partners including the German Federal Police are researching a „sustainable use of artificial intelligence in law enforcement agencies“ of the member states.

Even for European police forces, the transformation of Europol’s databases and the new possibilities it brings are not always easy to understand. In October 2019, the Europol Director therefore launched the Connecting Analysts platform (CONAN). Investigators from EU member states, EU agencies, third countries and international organisations can use it to exchange and discuss expertise on methods and resources. In a second step, the participants are also to develop „analytical tools“ themselves; CONAN will be supplemented by a code-sharing platform for this purpose.

Image: The Europol director Catherine De Bolle launching CONAN (Europol).

Author: Matthias Monroy

Knowledge worker, activist, editor of the German civil rights journal Bürgerrechte & Polizei/CILIP.