The EU Copyright Directive: New exception for text and data mining

Following on from Eleonora Rosati's look at Article 15, Benoit Van Asbroeck and Charlotte Haine take a look at the next key provision in MediaWrites' series on the EU Copyright Directive, providing an overview of Articles 3 and 4 – the new copyright exception for text and data mining.


The European Parliament recently approved the EU Directive on copyright in the Digital Single Market (‘DSM Directive’). Once approved by the Council and published in the Official Journal of the European Union, these rules will have to be implemented by Member States within 24 months.

The DSM Directive enacts two new text and data mining (“TDM”) exceptions (Articles 3 and 4, and Recitals 11, 14, 15, 17, 18).

Purpose of the provision

Text and data mining enables the harnessing of large amounts of information available in digital form and the extraction of its value. This practice is especially important in the context of artificial intelligence. The DSM Directive acknowledges this importance and accordingly introduces two new TDM exceptions. The introduction of these provisions also clarifies that TDM of copyright-protected works will be, in the absence of any exception, a breach of copyright.

Main obligations

The DSM Directive aims first to promote the use of TDM for scientific research purposes. To that end, Article 3 provides a new mandatory exception for the reproduction of copyright-protected works, and for extraction from databases, by research organisations and cultural heritage institutions for the sole purpose of scientific research. No prior authorisation need be sought from the rightholders, nor are they entitled to any form of compensation.

However, the exception benefits only users who have lawful access to the data, including freely accessible online content.

Beneficiaries of the new TDM exception include universities, public libraries and research institutions, as well as not-for-profit public hospitals, provided they are not indirectly controlled or influenced by a private entity. As such, research organisations whose management is influenced by a private entity will in principle not benefit from the TDM exception. In line with European research policy, the exception also applies to research organisations and cultural heritage institutions engaged in public-private partnerships; clearly, however, only the public partner will benefit from the exception.

A second mandatory TDM exception, applicable to any entity, is included in Article 4 of the DSM Directive. Whilst this additional exception, which is not tied to any specific purpose, appears quite extensive, it has at its heart an “opt-out” mechanism. Rightholders may exclude their works from the application of this exception in so far as their “opt-out” is explicitly expressed in an ‘appropriate manner’.

Potential implications

Companies involved in artificial intelligence that lack access to training data will have to reconsider their business model and possibly enter into public-private partnerships with public research organisations.

Text and data mining applies not just to text, but also encompasses images and sound. Consequently, companies active in the entertainment and media sector will have to assess the possible implications for their business model, as well as for the enforcement of their intellectual property rights.

Private entities that exercise a decisive influence over a research organisation are excluded from access to the results of that organisation's TDM. The scope of this exclusion is unclear. As a result, clients will have to pay special attention to the drafting of public-private partnership agreements.

Unanswered questions

Although there is no doubt that, over time, the European Court of Justice will establish a body of jurisprudence clarifying the initial issues laid out above, some key questions are currently left open by the text of the DSM Directive.

The terms ‘lawful access’ and ‘appropriate manner’ are not clearly defined under the current formulation, and will likely lead to litigation in the future.

It is further unclear whether an algorithm created or amended through TDM is to be considered as a ‘result’.

The MediaWrites series on the EU Copyright Directive will continue next week.
