Traditional vs Generative AI
The panel began by introducing the impact of generative AI on the media industry. Mr. Crutchfield provided a brief explanation of how AI tools such as ChatGPT work: these large language models (LLMs) are trained on massive amounts of raw data and work by calculating, at each step, the probability of the best next word in a sentence. Mr. Crutchfield highlighted the differences between traditional AI and generative AI, and how generative AI has enabled the creation of new content. Gen AI is as disruptive as Gutenberg’s invention of the printing press, which transformed the creation and distribution of content and ultimately led to the creation of copyright. The emergence of Gen AI also entails risks: LLMs can be subject to “hallucinations” and “bias.” According to Mr. Crutchfield, Gen AI is here to stay and will have a profound impact on society, competition, economics, ethics, and the practice of law, bringing both opportunities and risks.
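To make the next-word-probability idea concrete, here is a minimal sketch of the underlying principle, using a toy bigram model over a tiny invented corpus. It is an illustration only: real LLMs learn these probabilities with neural networks trained over billions of documents, but the core mechanism of estimating the likelihood of each possible next word is the same.

```python
# Toy illustration of next-word prediction: a bigram model built from a
# tiny corpus. Real LLMs learn such probabilities at vastly larger scale
# with neural networks, but the core idea is the same.
from collections import Counter, defaultdict

corpus = "the press changed the world and the press changed the law".split()

# Count how often each word follows each preceding word.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follow_counts[prev][nxt] += 1

def next_word_probabilities(word):
    """Return P(next word | current word) estimated from the corpus."""
    counts = follow_counts[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

print(next_word_probabilities("the"))
# -> {'press': 0.5, 'world': 0.25, 'law': 0.25}
```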
Ms. Saad described the use of AI tools by media companies, and specifically by Grupo Globo. In 2015, AI was already a big part of media companies’ economic model because it allowed them to analyze customers’ preferences and improve advertising. The true revolution started in 2022 with Gen AI, because Gen AI touches the soul of the media industry: it can create new content, including news articles, voices, and pictures. However, according to Ms. Saad, Gen AI also raises two new challenges for media companies:
- Public trust. Gen AI is a powerful tool for creating lies at scale. The resulting erosion of public trust concerns not only the media but also politicians, academics, and other institutions.
- The economic sustainability of media companies, with respect to copyright but also to the entrenched dominance of big tech platforms and smaller firms’ dependence on them. According to Ms. Saad, the need for fair compensation for the use of content, and the regulatory asymmetries between traditional media companies and big platforms, must be addressed by lawyers and competition authorities.
AI, Competition, and Copyrights
Mr. Meridor emphasized the competition issues arising from AI. According to him, competition authorities in the U.S. and E.U. tend to over-regulate in response to disruptive technologies, and the question arises of whether we should use existing tools or create new ones to regulate AI. Several cases concerning the potential anticompetitive effects of AI have already reached the courts, including allegations of algorithmic pricing collusion among companies using the same AI tool to set prices (illustrated in the sketch below). There are also concerns over monopolization. The industry has so far demonstrated that only tech giants can develop LLMs, raising the question of whether competition among those giants is sufficient to sustain a dynamic and innovative industry; based on history and the lessons of the 2000s, it does not seem to be. In March 2024, with the UN General Assembly’s adoption of a U.S.-led draft resolution, many countries agreed to develop “safe, secure and trustworthy” AI systems that “benefit sustainable development for all.” However, the debate over whether new regulatory tools need to be created for AI is still ongoing.
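To illustrate the algorithmic pricing concern mentioned above, here is a minimal, purely hypothetical simulation: two sellers feed their prices into the same pricing rule, and prices rise in lockstep without any explicit agreement. The rule and all numbers are illustrative assumptions, not taken from any actual case or tool.

```python
# Toy simulation of the concern: two sellers use the same pricing rule
# ("match the highest price seen, up to a cap"), and prices drift upward
# in lockstep with no explicit agreement. All numbers are invented.
CAP = 100.0

def shared_pricing_tool(own_price, competitor_price):
    """Same rule applied by both sellers: creep toward the higher price."""
    return min(max(own_price, competitor_price) * 1.05, CAP)

price_a, price_b = 50.0, 60.0
for step in range(12):
    price_a, price_b = (shared_pricing_tool(price_a, price_b),
                        shared_pricing_tool(price_b, price_a))
    print(f"step {step}: seller A = {price_a:.2f}, seller B = {price_b:.2f}")
# Both sellers converge to the cap, showing how a shared algorithm can
# align prices without direct coordination between the firms.
```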
Concerns have also been raised regarding copyright and the use of data for training AI models. Ms. Goossens delved into the legal implications of training AI models on public data without permission. She commented on the French Competition Authority’s March 2024 decision to fine Google after a long-running battle between French publishers and Google in the news industry. The Authority asserted that it had found enough evidence that publishers’ news content had probably been used to train Gemini, and that this training was done illegally. Ms. Goossens explained that IP and copyright issues arise at both ends of the Gen AI supply chain:
- Model training: the question is whether models can be trained on copyright-protected content without asking for permission.
- Model output: the question is whether a model’s output should be protected by copyright or considered part of the public domain.
The question of model training has been agitating the copyright world. Copyright holders argue that because developers must make a copy of the data to train their models, copyright is implicated, and developers therefore need to ask permission or take a license. The counter-argument is that this copy must fall under an exception, because it is made not to access the protected expression but only to access the information within it. In other words, because a fundamental principle of copyright is that information flows freely and cannot be owned (copyright protects only original expression), there must be a limit to the application of copyright in this context. In summary, the debate centers on whether fair use can apply to the training of models and whether this would be a good candidate for a copyright exception. Copyright holders argue that such an exception cannot apply, otherwise AI tools will replace their content.
Ms. Goossens explained that copyright working groups began looking at this issue several years ago. The question then concerned data mining, and whether regulation needed to change to make clear that “fair use” applies to data mining. Several countries have since amended their copyright laws to allow data copies for data mining or for training LLMs (e.g., Japan, Singapore). In the U.S., there was broad consensus that traditional AI training was a good candidate for fair use, but that consensus disappeared with the birth of Gen AI. Europe passed a new copyright exception in 2019, although rightsholders are able to opt out of it. The U.K. did not transpose this EU directive due to Brexit and therefore continues to be governed by the older exception: data can be copied only for research purposes.
Mr. Meridor described the issues raised in the New York Times case against OpenAI. ChatGPT was allegedly able to reproduce exact quotes from New York Times articles, even though the LLM should not have been able to “remember” that information verbatim. The New York Times claimed that OpenAI had access to data it should not have had access to. The main question relates to how money should be distributed among the parties. Ms. Grant noted that if LLMs can output exact quotes from the New York Times, then OpenAI and the New York Times will compete for users, because some users would rather use ChatGPT than the New York Times.
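As a concrete illustration of the “exact quotes” allegation, here is a minimal sketch of how one might test for verbatim regurgitation: compare a model’s completion against the original article and measure the longest shared run of words. Both strings below are invented, and `model_completion` is a hypothetical stand-in for the output of any LLM.

```python
# Minimal sketch: measure verbatim overlap between a model's output and a
# source article by finding the longest common run of words. A long run
# suggests memorized rather than paraphrased text. Both strings are
# invented; `model_completion` stands in for real LLM output.
from difflib import SequenceMatcher

article = "The committee voted unanimously to approve the measure on Tuesday afternoon."
model_completion = "Reports say the committee voted unanimously to approve the measure last week."

a, b = article.lower().split(), model_completion.lower().split()
match = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))

print("longest shared run:", " ".join(a[match.a : match.a + match.size]))
print("length in words:", match.size)
# An 8-word run here; runs of dozens of words in real outputs would be
# strong evidence of verbatim reproduction.
```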
AI Access, Accuracy, Transparency, and Security
The panel also touched on the impact of AI on journalism and the role of regulators in addressing these issues. Ms. Saad emphasized the importance of ethical AI and the responsible use of AI technologies. People need truthful information to participate in the democratic process and exercise their citizenship. Journalists and media companies play an important role in preventing the spread of fake news, and regulators should therefore ensure that these companies remain economically sustainable and are compensated for the content they create. Ms. Saad also discussed the need for a level playing field and fair access to data, as well as the importance of transparency and education in media companies’ use of AI technologies. Finally, the panel highlighted issues around privacy, personal data, and cybersecurity. Mr. Crutchfield pointed to the appearance of new phone scams that use AI-generated imitations of familiar voices.
Ms. Goossens provided an overview of recent regulation in Europe:
- The EU Data Act governs the accumulation of large data sets and aims to promote interoperability and data access for small businesses.
- The EU AI Act governs foundation models and has extraterritorial reach: developers of foundation models must respect copyright law where the AI system’s output is used in the EU. LLMs distributed in Europe must therefore comply with European law even if they were trained elsewhere. The goal of that regulation is to promote a “level playing field,” so that local companies do not compete against rivals that are not subject to the same rules.
The panel concluded with a call for collaboration between regulators, lawyers, and businesses to address the challenges and opportunities presented by AI in the media industry.