Text and Data Mining as a key tool for machine learning: European experience in legal regulation
DOI:
https://doi.org/10.15330/apiclu.69.4.1-4.17
Keywords:
artificial intelligence, machine learning, text and data mining, copyright, TDM opt-out.
Abstract
The article examines the theoretical and practical aspects of applying text and data mining (TDM).
From a technological perspective, the quality, accuracy, and efficiency of AI largely depend on TDM algorithms and methods, which are designed to identify non-obvious and practically useful correlations and other patterns in large volumes of digital sources. TDM links the information (data) extracted from digital sources to the algorithmic processes of machine learning. Machine learning, AI, and TDM are closely interconnected: TDM processes, prepares, and combines data suitable for analysis into datasets, which are then used directly to train neural networks (machine learning); machine learning analyzes the data and identifies patterns in it; and artificial intelligence applies the acquired knowledge in practice.
The use of TDM is directly tied to the use of copyright-protected works as data sources (information). For this reason, EU Directive 2019/790 on copyright and related rights in the Digital Single Market introduced exceptions to the rightholders’ legal monopoly that permit the use of TDM for the purposes of scientific research and, separately, for commercial purposes. Directive 2019/790 also establishes requirements regarding the types of entities that can carry out TDM, lawful access to content, retention periods for copies of works, and the “TDM opt-out” mechanism, which allows rightholders to prohibit the use of their works as input data for AI training.
Special attention is given to the court decision in Robert Kneschke v. LAION e.V., in which the court interpreted the provisions of German legislation implementing Directive 2019/790, particularly with respect to the entities entitled to rely on the TDM exceptions and the nature of the research they conduct.
Since TDM is not yet legally recognized in Ukraine, the article emphasizes the need to study the European experience in this field in order to develop and implement a comparable national model for the legal regulation of TDM as a key tool for machine learning.
