GDPR & Generative AI: what the CNIL says
Summary
- What is a generative AI system?
- What are the benefits of generative AI?
- What are the risks of generative AI under GDPR?
- What types of generative AI systems can be used?
- How to choose a GDPR compliant generative AI system?
- On-premise, cloud or API: which deployment method is most GDPR compliant?
- How to regulate the use of generative AI in business?
- How to train users of a generative AI system?
- What role for the DPO in the governance of generative AI?
- How to ensure compliance with GDPR and European AI Regulation?
- Useful resources
What is a generative AI system?
Generative AI refers to systems that automatically produce content such as text, images, audio, code or video. These models, including LLMs (Large Language Models), are often described as general-purpose AI. They are trained on large quantities of data, often drawn from the Internet, licensed databases or user conversations.
What are the benefits of generative AI?
- Quickly create content (texts, visuals, code...)
- Translate, reformulate or improve existing content
- Analyze large amounts of data (e.g. summaries, categorization)
- Improve employee productivity
What are the risks of generative AI under GDPR?
- Hallucinations: false but credible content
- Black box: lack of explainability of the results
- Algorithmic bias: possible discrimination
- Malicious uses: deepfakes, phishing, malware creation
- Legal risk: processing personal data without a legal basis
What types of generative AI systems can be used?
- Off-the-shelf model: already trained, ready to use (e.g. the GPT API, LLaMA, Mistral)
- RAG (Retrieval-Augmented Generation): connected to an internal database
- Fine-tuning: retraining the model on internal data
- Model developed internally: realistic only for organizations with substantial resources
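To make the RAG option more concrete, here is a minimal sketch in Python. It assumes a toy in-memory list of documents and a naive word-overlap score standing in for a real vector search. The names `DOCUMENTS`, `retrieve` and `build_prompt` and the sample texts are hypothetical, and no model is actually called.

```python
import re

# Hypothetical internal knowledge base; a real system would use a document store.
DOCUMENTS = [
    "Expense reports must be submitted within 30 days.",
    "Remote work is allowed up to three days per week.",
    "Security incidents must be reported to the IT helpdesk immediately.",
]

def tokenize(text):
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(question, documents, top_k=1):
    """Rank documents by word overlap with the question (stand-in for vector search)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(question, documents, top_k=1):
    """Assemble the text that would be sent to the model: retrieved context, then the question."""
    context = "\n".join(retrieve(question, documents, top_k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

if __name__ == "__main__":
    print(build_prompt("How many days of remote work are allowed?", DOCUMENTS))
```

The point for GDPR purposes is that the internal documents end up inside the prompt, so whatever applies to the data sent to the model also applies to the database the system retrieves from.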
How to choose a GDPR compliant generative AI system?
Assess each candidate system on:
- Its security (rejects unlawful requests, does not leak data)
- Its robustness (few hallucinations, cites its sources)
- The availability of GDPR documentation and assessments
- The compliance of its license and training data
On-premise, cloud or API: which deployment method is most GDPR compliant?
- On-premise: more secure, no transfer to third parties
- Secure cloud: acceptable if governed by a data processing agreement with the provider (Article 28 GDPR)
- Public API: avoid for personal or confidential data
For sensitive data, on-premise deployment is preferred.
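The rules above can be sketched as a small decision helper. This is an illustration under stated assumptions, not an official CNIL decision tree: the `DataClass` categories and the `deployment_options` function are hypothetical, and the sketch treats "preferred" as "only" for sensitive data, which is stricter than the text.

```python
from enum import Enum

class DataClass(Enum):
    # Hypothetical classification used only for this sketch.
    NON_PERSONAL = "non-personal, non-confidential"
    PERSONAL_OR_CONFIDENTIAL = "personal or confidential"
    SENSITIVE = "sensitive"

def deployment_options(data_class, has_processing_agreement):
    """Return acceptable deployment modes, most preferred first."""
    options = ["on-premise"]  # no transfer to third parties
    if data_class is DataClass.SENSITIVE:
        # Assumption: for sensitive data, "preferred" is read as "only".
        return options
    if has_processing_agreement:
        # Cloud is acceptable only when a contract with the provider is in place.
        options.append("secure cloud")
    if data_class is DataClass.NON_PERSONAL:
        # Public APIs are to be avoided for personal or confidential data.
        options.append("public API")
    return options

if __name__ == "__main__":
    print(deployment_options(DataClass.PERSONAL_OR_CONFIDENTIAL, True))
```

An organization would adapt the categories to its own data classification policy.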
How to regulate the use of generative AI in business?
- Write an internal usage charter
- Train users to verify results
- Restrict the data provided to the system (no sensitive data)
- Disable provider reuse of prompts
- Carry out a data protection impact assessment (DPIA, AIPD in French) if necessary
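One way to restrict the data provided to the system is to filter prompts before they leave the company. The sketch below is a minimal Python illustration that masks e-mail addresses and phone numbers with regular expressions. The `redact` function and both patterns are hypothetical and deliberately simple: they will miss names, postal addresses and many phone formats, so they complement user training rather than replace it.

```python
import re

# Simplified patterns, assumed for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d .-]{7,}\d")

def redact(prompt):
    """Replace e-mail addresses and phone numbers with placeholders."""
    prompt = EMAIL.sub("[EMAIL]", prompt)
    prompt = PHONE.sub("[PHONE]", prompt)
    return prompt

if __name__ == "__main__":
    text = "Contact jane.doe@example.com or +33 1 23 45 67 89 about the invoice."
    print(redact(text))
    # prints: Contact [EMAIL] or [PHONE] about the invoice.
```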
How to train users of a generative AI system?
Users should be:
- Made aware of the limits of AI (hallucinations, bias, confidentiality)
- Trained to check the quality of responses
- Encouraged never to copy and paste an output without reviewing it
- Informed of the standard prompts authorized by the company
What role for the DPO in the governance of generative AI?
The DPO:
- Takes part in the risk analysis, including the data protection impact assessment (DPIA)
- Oversees the processing of personal data
- Trains users and can alert management if a problem arises
An AI ethics committee or a designated AI contact person can be useful for sensitive uses.
How to ensure compliance with GDPR and European AI Regulation?
GDPR
- Check if personal data is processed
- Ask the provider about the origin and legal basis of their data
- Conduct a data protection impact assessment (DPIA) if the system has a significant impact on people
AI Regulation (AI Act, in force since August 1, 2024; obligations apply in stages)
- Respect transparency requirements (e.g. mentioning whether content is generated by AI)
- Assess whether the system is high risk, in which case additional obligations apply
- Avoid prohibited systems (e.g. real-time remote biometric identification in publicly accessible spaces)
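Some of the checks above can be tracked per system in a simple review record. The following Python sketch is an assumption about how an organization might do this, not a legal test: the `SystemReview` fields and the `open_actions` function are hypothetical, and the high-risk and prohibited-use assessments are left out because they require a case-by-case legal analysis.

```python
from dataclasses import dataclass

@dataclass
class SystemReview:
    # Hypothetical fields mirroring some of the checks listed above.
    processes_personal_data: bool
    provider_documented_data_origin: bool
    significant_impact_on_people: bool
    impact_assessment_done: bool
    ai_content_labelled: bool

def open_actions(review):
    """List the compliance actions still outstanding for one system."""
    actions = []
    if review.processes_personal_data and not review.provider_documented_data_origin:
        actions.append("Ask the provider about the origin and legal basis of its data")
    if review.significant_impact_on_people and not review.impact_assessment_done:
        actions.append("Conduct a data protection impact assessment")
    if not review.ai_content_labelled:
        actions.append("Mention that content is generated by AI")
    return actions

if __name__ == "__main__":
    review = SystemReview(True, False, True, False, True)
    for action in open_actions(review):
        print("-", action)
```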
Useful resources