Instructions

All versions of the ECHR OpenData project are available with different options:

  1. Data Format: defines the kind of data to download: structured, unstructured, or raw.
  2. Data Type: Within a given data format, several types of data are available. For instance, for the structured format, case descriptions are available, as well as Bag-of-Words and TF-IDF representations of judgments.
  3. Extension: defines the actual file extension to download. The available extensions depend on the data format and data type. For instance, the structured data are available in (flat) JSON, CSV and SQLite for basic information.

What files do I need?

It depends on your usage! Structured data are designed to be read directly by popular data manipulation libraries such as pandas or NumPy, and are therefore easy to use with machine learning libraries such as scikit-learn.

How are the data retrieved and processed?

Refer to the documentation!

Version 2.0.0 (current)
READ THE RELEASE NOTES

DATA STRUCTURE

All-in-One

All files together:
Zip

This archive includes every file listed below, including the SQLite database.

SQLite database:
SQLite

The SQLite database contains all the structured information about cases as well as the parsed documents in JSON format.

Info: The SQLite database contains neither the raw documents nor the Bag-of-Words and TF-IDF representations of judgment documents.
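Since the table layout of the SQLite database is not described here, a reasonable first step is to inspect it. The sketch below uses only Python's standard `sqlite3` module; the function name `describe_database` is hypothetical and not part of the project.

```python
import sqlite3
from contextlib import closing


def describe_database(db_path):
    """Map each table name in a SQLite file to its list of column names."""
    with closing(sqlite3.connect(db_path)) as conn:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema = {}
        for table in tables:
            # PRAGMA table_info yields one row per column; index 1 is the column name.
            columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
```

Note that `sqlite3.connect` creates an empty database when the path does not exist, so check the path to the downloaded file before calling the function.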

Structured

Cases description:
CSV
JSON

Includes structured information about cases.
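As an illustration, the case descriptions could be loaded with the Python standard library alone. This is a minimal sketch: `load_cases` is a hypothetical helper, and it assumes the flat JSON file is a top-level array of case objects, which is not confirmed here.

```python
import csv
import json
from pathlib import Path


def load_cases(path):
    """Load a flat case-description file (.csv or .json) as a list of dicts."""
    path = Path(path)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        # csv.DictReader uses the header row as keys; all values are strings.
        with path.open(newline="", encoding="utf-8") as handle:
            return list(csv.DictReader(handle))
    if suffix == ".json":
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Assumption: the flat JSON file is an array of case records.
        if not isinstance(data, list):
            raise ValueError("expected a JSON array of case records")
        return data
    raise ValueError(f"unsupported extension: {suffix}")
```

The resulting list of dictionaries can be passed as-is to `pandas.DataFrame` if a tabular view is preferred.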

Decision Body matrix:
JSON

Relation between cases and persons in their respective decision body.

Extracted Apps matrix:
JSON

Relation between cases through the citations in their respective judgment documents.

Representative matrix:
JSON

Relation between cases and representatives.

Strasbourg Case Law matrix:
JSON

Relation between cases through the relevant Strasbourg Case Law.

Bag-of-Words matrix:
libsvm

Bag-of-Words representation of judgment documents.

TF-IDF matrix:
libsvm

TF-IDF representation of judgment documents.
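The libsvm format stores one sparse vector per line as `label index:value index:value ...`. The sketch below parses a single line with plain Python to show the structure; `parse_libsvm_line` is a hypothetical helper, and it assumes each line starts with a numeric label and has no `qid:` field, which may not match these files exactly.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value}).

    Returns None for blank lines and comment-only lines.
    """
    line = line.split("#", 1)[0].strip()  # drop an optional trailing comment
    if not line:
        return None
    label, *pairs = line.split()
    features = {}
    for pair in pairs:
        index, value = pair.split(":", 1)
        features[int(index)] = float(value)
    return float(label), features
```

In practice, scikit-learn's `load_svmlight_file` reads a whole file in this format into a SciPy sparse matrix, which is usually the more convenient route.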

Unstructured

Cases description + Parsed judgments:
JSON

Case descriptions in a non-flat (nested) JSON format, including the parsed judgment documents and all the matrices from the structured data.
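To relate the nested records to the flat structured files, nested objects can be collapsed into dotted keys. This is only a sketch of that idea: `flatten` is a hypothetical helper, and the field names in the usage example are invented for illustration rather than taken from the dataset.

```python
def flatten(record, parent_key="", sep="."):
    """Collapse nested dicts into a single-level dict with dotted keys.

    Lists and scalar values are kept as they are.
    """
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical record; the real field names may differ.
example = {"itemid": "001", "decision_body": {"president": "A", "size": 7}}
flat_example = flatten(example)
```

Here `flat_example` holds the keys `itemid`, `decision_body.president` and `decision_body.size`.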

Raw

Judgment documents:
docx

Raw judgment documents in Microsoft Word format.

Normalized judgment documents:
txt

Judgment documents preprocessed before being turned into Bag-of-Words.