Are European AI Regulations the Future of Data Protection Compliance?

January 2, 2025

In recent years, the European Union has taken significant steps to regulate artificial intelligence (AI) and ensure the protection of personal data. The European Data Protection Board (EDPB) and national data protection authorities, such as the Italian Garante, have been at the forefront of these efforts. This analysis examines the EDPB’s opinion on training AI models using personal data and the consequences of the Italian Garante’s significant €15 million fine on OpenAI for GDPR breaches. It also outlines practical measures data controllers can take when deploying Large Language Models (LLMs) to manage data protection risks effectively. Together, these developments underline the importance of compliance and the increasingly stringent regulatory landscape within the EU regarding AI applications.

The EDPB’s Stance on AI and Personal Data

The EDPB has been unequivocal in its stance on the use of personal data in AI model training, setting rigorous standards and emphasizing the critical need for accountability among AI developers and deployers. In its detailed opinion, the EDPB underscores three pivotal aspects: the need for anonymity, establishing legitimate interests, and the impact of unlawful processing.

On the anonymity front, the EDPB stipulates that evaluating whether an AI model processes personal data requires a case-by-case assessment. The bar for anonymity is notably high, requiring that the likelihood of extracting personal data from the model is nearly negligible. This stringent requirement implies that most Large Language Models (LLMs) won’t be deemed anonymous, necessitating additional guidelines from the EDPB on anonymization, pseudonymization, and the implications for data scraping in generative AI contexts.

Equally significant is the EDPB’s focus on legitimate interests for developing and deploying AI models. Establishing clear and precise justifications for processing personal data is crucial, and these justifications must align with the reasonable expectations of data subjects. The necessity and balancing tests during the development phase are fundamental, with suggested mitigations like excluding data from vulnerable individuals or websites that prohibit web scraping. Ensuring transparency and minimizing risks to data subject rights are paramount in maintaining lawful processing standards.

Finally, the EDPB addresses the ramifications of unlawful processing. If personal data used in developing an AI model was processed unlawfully, it casts a shadow on the model’s subsequent use. Controllers deploying such models need to ascertain the lawfulness of development by examining data sources and previous assessments by supervisory authorities or courts regarding GDPR infringements. Failure to address initial unlawful processing can severely impact the legality of the model’s deployment, particularly concerning legitimate interests assessments.

The Italian Garante’s Fine on OpenAI

On March 30, 2023, the Italian Garante imposed a temporary processing limitation on OpenAI, effectively banning ChatGPT’s use in Italy until OpenAI complied with ordered measures designed to rectify various compliance issues. OpenAI responded by improving its privacy notices and identification of a lawful basis by the deadline of April 28, 2023. However, the Garante continued with a more in-depth investigation, which revealed further concerns.

Upon further examination, the Garante identified multiple GDPR breaches committed by OpenAI. Notably, OpenAI lacked a lawful basis for training ChatGPT pre-launch and failed to adequately communicate pertinent information to data subjects. Additionally, OpenAI was found deficient in data protection by design mechanisms, such as age verification, and non-compliant with orders from the supervisory authority. Due to these breaches, the Garante imposed a substantial €15 million fine on OpenAI and mandated a six-month-long information campaign to bolster transparency. While OpenAI has publicly stated its intention to appeal the fine, arguing that it is disproportionate to the alleged breaches, the incident serves as a cautionary tale for all AI developers regarding the stringent regulatory expectations.

Practical Steps for Data Controllers Deploying LLMs

For data controllers looking to deploy LLMs, ensuring compliance with data protection regulations is essential. The journey begins with a thorough, well-documented assessment that includes legitimate interests assessments and data protection impact assessments. These assessments should meticulously detail clear justifications for processing personal data and identify measures to protect data subject rights and mitigate risks. For instance, data relating to vulnerable individuals should be excluded to minimize potential harms.
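As an illustration only (the class and field names below are hypothetical and are not drawn from any regulatory template), the record-keeping side of such assessments can be sketched as a simple structured log that enforces a minimal completeness check:

```python
from dataclasses import dataclass, field

@dataclass
class ProcessingAssessment:
    """Hypothetical record combining a legitimate interests assessment (LIA)
    entry and a data protection impact assessment (DPIA) entry for one use case."""
    use_case: str
    lawful_basis: str                      # e.g. "legitimate interests"
    justification: str                     # a clear, specific purpose
    mitigations: list = field(default_factory=list)
    excludes_vulnerable_data: bool = False

    def is_documented(self) -> bool:
        # Minimal completeness check: every field must be filled in and at
        # least one risk mitigation recorded before deployment proceeds.
        return all([self.use_case, self.lawful_basis, self.justification,
                    self.mitigations, self.excludes_vulnerable_data])

assessment = ProcessingAssessment(
    use_case="customer-support chatbot",
    lawful_basis="legitimate interests",
    justification="answering product queries without profiling users",
    mitigations=["prompt PII redaction", "no training on user inputs"],
    excludes_vulnerable_data=True,
)
print(assessment.is_documented())  # True
```

A real assessment is of course a substantive legal document, not a data structure; the point of the sketch is simply that each deployment should leave an auditable record of its justification and mitigations.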

An additional crucial step involves verifying that the personal data used in training LLMs complies with GDPR. Where possible, data controllers should opt for enterprise offerings that contractually exclude customer data from the model training process. Likewise, ensuring that prompt data and data used for Retrieval-Augmented Generation (RAG) are excluded from training helps meet confidentiality obligations and reduces the risk of regulatory infringements.
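To make the prompt-hygiene point concrete, here is a minimal, purely illustrative sketch of redacting obvious personal data (e-mail addresses and phone-number-like strings) from a prompt before it reaches an LLM. The patterns are deliberately naive; a real deployment would need a dedicated PII-detection tool alongside contractual no-training guarantees:

```python
import re

# Naive patterns for illustration only; production systems should use a
# proper PII-detection library rather than hand-rolled regexes.
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b")
PHONE = re.compile(r"\+?\d(?:[\s-]?\d){6,13}")

def redact_pii(prompt: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = PHONE.sub("[PHONE]", prompt)
    return prompt

print(redact_pii("Contact jane.doe@example.com or +44 20 7946 0958."))
# Contact [EMAIL] or [PHONE].
```

Redaction of this kind complements, but does not replace, choosing a provider tier that excludes customer inputs from training.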

Implementing robust AI governance programs is indispensable for overseeing AI use cases and ensuring adherence to a variety of legal frameworks. These programs should align with financial services regulations and accommodate the EU AI Act, whose first obligations apply from February 2, 2025, and which imposes strict penalties for non-compliance. Clear policies and technical measures must be established to prevent unauthorized tool use by staff, ensuring that only officially approved, comprehensively assessed tools are available for use.

Heightened Regulatory Scrutiny and the Future of AI Compliance

There is a clear trend towards heightened regulatory scrutiny of the use of personal data in training AI models. Authorities such as the EDPB stress the need for rigorous assessments and accountability, with fines like the one imposed on OpenAI underlining the severe consequences of non-compliance. This growing attention reflects a widespread consensus on the importance of establishing a clear and lawful basis for processing personal data in AI model development.

Ensuring that deployed models are lawfully developed is also a key priority. This involves thoroughly verifying the initial processing of personal data to prevent subsequent legal challenges. For data controllers, conducting thorough due diligence is imperative to confirm that personal data used for training AI models adheres to GDPR and other applicable regulations.

As the regulatory landscape continues to evolve, it is crucial for data controllers to stay informed about new developments and adapt their practices accordingly. Keeping up with guidelines on anonymization, pseudonymization, and data scraping in generative AI contexts, along with understanding the implications of new regulations such as the EU AI Act, will be essential for navigating this complex and dynamic environment.

Conclusion

The European Union’s approach to regulating AI and data protection has set a rigorous standard for compliance. With the EDPB’s opinion and the substantial fine levied by Italy’s data protection authority against OpenAI, the importance for AI model developers and deployers of meeting these stringent standards has become evident. To navigate this complex regulatory environment effectively, data controllers must conduct thorough assessments, ensure lawful data processing practices, and develop strong AI governance programs.

These measures are essential in the context of growing regulatory scrutiny and accountability, which aim to integrate AI advancements with robust data protection. The focus now is to build AI innovations that are both trustworthy and ethical. By adhering to these high standards, AI technology can progress in a manner that gains public trust and fosters responsible innovation.

The EU’s proactive stance on AI regulation is crucial for ensuring that technological advancements do not compromise individual rights and privacy. As AI continues to develop, these regulations will play a pivotal role in guiding ethical progress and reinforcing the importance of data protection. The path forward involves balancing innovation with the need for stringent compliance, paving the way for a future where AI and data protection coexist harmoniously.
