GDPR & Generative AI: what the CNIL says

What is a generative AI system?

Generative AI refers to systems that automatically produce content such as text, images, audio, code or video. The underlying models, such as LLMs (Large Language Models), are often described as general-purpose AI. They are trained on large quantities of data, often drawn from the Internet, licensed databases or user conversations.

What are the benefits of generative AI?

  • Quickly create content (texts, visuals, code...)
  • Translate, reformulate or improve existing content
  • Analyze large amounts of data (e.g. summaries, categorization)
  • Improve employee productivity

What are the risks of generative AI under GDPR?

  • Hallucinations: false but credible content
  • Black box: lack of explainability of the results
  • Algorithmic bias: possible discrimination
  • Malicious uses: deepfakes, phishing, malware creation
  • Legal risk: if personal data is used without a legal basis

What types of generative AI systems can be used?

  • Off-the-shelf model: already trained and ready to use (e.g. the GPT API, LLaMA, Mistral)
  • RAG (Retrieval-Augmented Generation): connected to an internal database
  • Fine-tuning: retraining the model on internal data
  • Model developed internally: reserved for actors with strong resources
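
To make the RAG option more concrete, here is a minimal sketch in Python. It picks the internal document that shares the most words with the user's question and places it in front of the prompt. The documents, the function names and the word-overlap scoring are illustrative assumptions; production systems typically rely on vector embeddings and a dedicated search index instead.

```python
import re

# Hypothetical internal knowledge base (illustrative content only).
DOCUMENTS = {
    "leave-policy": "Employees must request annual leave two weeks in advance.",
    "expense-policy": "Expense reports are submitted monthly with receipts attached.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(question: str) -> str:
    """Return the id of the document sharing the most words with the question."""
    question_words = tokenize(question)
    return max(
        DOCUMENTS,
        key=lambda doc_id: len(question_words & tokenize(DOCUMENTS[doc_id])),
    )


def build_prompt(question: str) -> str:
    """Prepend the retrieved passage so the model can answer from internal data."""
    doc_id = retrieve(question)
    return f"Context [{doc_id}]: {DOCUMENTS[doc_id]}\nQuestion: {question}"


if __name__ == "__main__":
    print(build_prompt("How do I request annual leave?"))
```

Because the retrieved passage is sent to the model with every request, any personal data stored in the internal database ends up in the prompt, so the database content needs the same scrutiny as the prompts themselves.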

How to choose a GDPR-compliant generative AI system?

Assess a candidate system on:

  • Security (rejection of illegal requests, no data leaks)
  • Robustness (few hallucinations, citation of sources)
  • The availability of GDPR documentation and assessments
  • The compliance of its license and training data

On-premise, cloud or API: which deployment method is most GDPR compliant?

  • On-premise: more secure, no transfer to third parties
  • Secure cloud: acceptable if governed by a GDPR data processing agreement with the provider
  • Public API: avoid for personal or confidential data

For sensitive data, on-site deployment (on-premise) is preferred.
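
The deployment guidance above can be written down as a small routing rule, which some teams find useful to embed in an internal gateway. The sketch below is one possible reading, not an official CNIL rule: the three data classes, the `has_dpa` flag (whether a data processing agreement is in place) and the decision to treat "preferred" as "required" for sensitive data are all assumptions made for illustration.

```python
from enum import Enum


class DataClass(Enum):
    """Hypothetical data classification used only for this sketch."""
    PUBLIC = "public"        # no personal or confidential data
    PERSONAL = "personal"    # ordinary personal or confidential data
    SENSITIVE = "sensitive"  # e.g. health data


def allowed_deployments(data_class: DataClass, has_dpa: bool) -> list[str]:
    """Return the deployment modes permitted for a given class of data."""
    if data_class is DataClass.SENSITIVE:
        # Assumption: "preferred" is applied strictly for sensitive data.
        return ["on_premise"]
    if data_class is DataClass.PERSONAL:
        # Cloud is acceptable only under a data processing agreement;
        # a public API is never used for personal or confidential data.
        return ["on_premise", "secure_cloud"] if has_dpa else ["on_premise"]
    # Data with no personal or confidential content can go anywhere.
    return ["on_premise", "secure_cloud", "public_api"]


if __name__ == "__main__":
    print(allowed_deployments(DataClass.PERSONAL, has_dpa=True))
```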

How to regulate the use of generative AI in business?

  • Write an internal usage charter
  • Train users to verify results
  • Restrict the data provided to the system (no sensitive data)
  • Disable provider reuse of prompts
  • Carry out a data protection impact assessment (DPIA) if necessary
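
Restricting the data sent to the system can be partly automated by masking obvious identifiers before a prompt leaves the company. The following is a minimal sketch under stated assumptions: the two regular expressions and the `redact` helper are hypothetical and only catch simple email addresses and phone numbers. They are no substitute for user training or a proper data classification.

```python
import re

# Illustrative patterns only; real personal data takes many more forms
# (names, addresses, identifiers) that simple regexes will not catch.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d .-]{7,}\d"),
}


def redact(prompt: str) -> str:
    """Replace each matched identifier with a placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


if __name__ == "__main__":
    print(redact("Contact jean.dupont@example.com or 01 23 45 67 89 today"))
```

A filter like this works best as a safety net behind the usage charter: it reduces accidental leaks but will miss anything the patterns do not anticipate.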

How to train users of a generative AI system?

Users should be:

  • Made aware of the limits of AI (hallucinations, bias, confidentiality)
  • Trained to check the quality of responses
  • Encouraged never to copy and paste an output without proofreading it
  • Informed about the standard prompts authorized by the company

What is the DPO's role in the governance of generative AI?

The DPO:

  • Participates in the risk analysis (data protection impact assessment, or DPIA)
  • Oversees the processing of personal data
  • Trains users and can alert management if a problem arises

An AI ethics committee or a designated point of contact can be useful for sensitive uses.

How to ensure compliance with the GDPR and the European AI Regulation?

GDPR

  • Check if personal data is processed
  • Ask the provider about the origin and legal basis of their data
  • Conduct a data protection impact assessment (DPIA) if the system has a significant impact on people

AI Regulation (in force since August 1, 2024)

  • Respect transparency requirements (e.g. mentioning whether content is generated by AI)
  • Assess whether the system is high risk, in which case additional obligations apply
  • Avoid prohibited systems (e.g. real-time remote biometric identification in public spaces)

Useful resources