GDPR and AI: Data Privacy Requirements for AI Systems

GDPR applies to AI in ways that many organizations have not fully mapped. Training data, automated decisions, data subject requests, and retention practices all create compliance obligations that standard GDPR programs may not address.

Where GDPR and AI intersect

GDPR is triggered whenever personal data is processed. AI systems process personal data in multiple ways: during training (when models learn from datasets containing personal information), in production (when AI processes data about individual users or customers), and in output (when AI generates content that includes personal information).

Each of these processing activities requires a lawful basis under GDPR, must comply with data subject rights, and must meet data protection principles including purpose limitation, data minimization, and storage limitation.

The intersection of GDPR and AI creates obligations in three primary areas: training data, automated decision-making, and data subject rights.

Training data and consent

Training AI models on personal data requires a lawful basis under Article 6 of GDPR. The most common lawful bases are:

Legitimate interests. Many businesses rely on legitimate interests as the lawful basis for processing personal data in AI training. This requires a legitimate interests assessment (LIA) that balances the business’s interest against the individual’s rights and expectations. Using publicly available data from the internet as training data, for example, may be justified under legitimate interests in some cases, but not all.

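Contract. If the AI is trained on data generated through a customer relationship (transaction data, support interactions), processing to improve that service may be justified under contract, but only where the processing is genuinely necessary to perform the contract with that individual. This basis tends to be read narrowly, so general service improvement often needs a different lawful basis.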

Consent. For processing that cannot be justified under other lawful bases, consent may be required. Consent for AI training purposes must be freely given, specific, informed, and unambiguous, and it must be explicit where special category data is involved.

The retention problem. GDPR’s storage limitation principle requires that personal data is not kept longer than necessary. Training data retained indefinitely for potential future model retraining may not satisfy storage limitation requirements.

Automated decision-making requirements

Article 22 of GDPR creates specific rights around automated decision-making that “significantly affects” individuals. These apply directly to many business AI applications.

Article 22 applies when both of the following are true:

A decision is made solely by automated means
The decision produces legal effects or similarly significant effects on the individual

Credit decisions, insurance pricing, employment screening, and medical triage all potentially trigger Article 22 where AI makes the decision without meaningful human review.
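
As a rough illustration, the two-part test above can be expressed as a screening function. This is a minimal sketch, not a legal determination: the DecisionProfile class and its field names are hypothetical, and deciding whether each flag is true for a real system still requires human and legal judgment.

```python
from dataclasses import dataclass


@dataclass
class DecisionProfile:
    """Hypothetical description of one AI-supported decision process."""
    name: str
    solely_automated: bool         # no meaningful human review before the decision takes effect
    legal_or_similar_effect: bool  # e.g. a credit refusal or a hiring rejection


def article_22_applies(profile: DecisionProfile) -> bool:
    """Both conditions from the text must hold for Article 22 to be triggered."""
    return profile.solely_automated and profile.legal_or_similar_effect


credit = DecisionProfile("credit scoring", solely_automated=True, legal_or_similar_effect=True)
recs = DecisionProfile("product recommendations", solely_automated=True, legal_or_similar_effect=False)

print(article_22_applies(credit))  # True
print(article_22_applies(recs))    # False
```

A screening step like this is only useful as a first filter for deciding which systems need a closer Article 22 review.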

Article 22 obligations:

Individuals have the right not to be subject to solely automated decisions with significant effects
If automated decisions are permitted (because they are necessary for a contract, authorized by law, or based on explicit consent), individuals have the right to: obtain human review, express their point of view, and challenge the decision
The organization must provide meaningful information about the logic involved in the automated decision (a transparency duty that comes from Articles 13 to 15)

Many organizations believe they comply with Article 22 because a human reviews AI outputs. The key word is “meaningful.” A human reviewer who rubber-stamps AI recommendations without genuine evaluation does not constitute adequate human review under GDPR.

Data subject rights in AI contexts

Standard GDPR data subject rights create challenges in AI contexts that standard processes may not handle.

Right of access. An individual can request information about how their data is processed, including how it was used in AI training or how AI has processed data about them. Your access request process needs to be able to identify whether an individual’s data was in a training dataset and what AI processing has occurred.
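
One way to make that identification possible is to keep a provenance log and index it by data subject. The sketch below assumes such a log exists as (dataset, subject identifier) pairs; the dataset names, identifiers, and function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical provenance log: (dataset name, subject identifier) pairs
provenance_log = [
    ("support_tickets_2023", "user-17"),
    ("support_tickets_2023", "user-42"),
    ("churn_training_v2", "user-42"),
]


def build_subject_index(log):
    """Map each subject identifier to the set of datasets that contain their data."""
    index = defaultdict(set)
    for dataset, subject_id in log:
        index[subject_id].add(dataset)
    return index


def datasets_for_subject(index, subject_id):
    """Return the datasets holding this subject's data, sorted for stable output."""
    return sorted(index.get(subject_id, set()))


index = build_subject_index(provenance_log)
print(datasets_for_subject(index, "user-42"))  # ['churn_training_v2', 'support_tickets_2023']
print(datasets_for_subject(index, "user-99"))  # []
```

The design choice here is to record provenance at ingestion time, since reconstructing it after the fact from raw training files is usually much harder.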

Right to erasure. An individual can request deletion of their personal data. For AI training data, erasure from the training dataset is relatively straightforward. Erasure from a trained model is not: retraining a model to remove the influence of a specific individual’s data is technically complex and often not practically feasible. Understanding the limits of erasure rights in AI contexts and documenting them is important.
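
The split between dataset erasure and model erasure can be made explicit in code. The following is a minimal sketch under assumed data structures (in-memory lists of rows and a hypothetical model lineage map): it deletes a subject's rows and reports which trained models were built on the affected datasets, so the limitation can be documented rather than ignored.

```python
def erase_subject(datasets, model_lineage, subject_id):
    """Remove a subject's rows from every dataset and report affected models.

    datasets: dict of dataset name -> list of row dicts with a "subject_id" key
    model_lineage: dict of model name -> list of dataset names it was trained on
    Both structures are illustrative assumptions, not a standard schema.
    """
    touched = set()
    for name, rows in datasets.items():
        kept = [row for row in rows if row["subject_id"] != subject_id]
        if len(kept) != len(rows):
            datasets[name] = kept
            touched.add(name)
    # Models trained on a touched dataset may still reflect the erased data.
    affected_models = sorted(
        model for model, sources in model_lineage.items()
        if touched.intersection(sources)
    )
    return sorted(touched), affected_models


datasets = {
    "tickets": [{"subject_id": "u1", "text": "a"}, {"subject_id": "u2", "text": "b"}],
    "surveys": [{"subject_id": "u2", "score": 4}],
}
model_lineage = {"triage_model": ["tickets"], "nps_model": ["surveys"]}

print(erase_subject(datasets, model_lineage, "u1"))  # (['tickets'], ['triage_model'])
```

The returned list of affected models is the part that feeds the documentation of erasure limits: it records which models may need retraining or a written justification.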

Right to rectification. If personal data used in AI training or processing is inaccurate, individuals can request correction. This may require identifying and correcting data in training datasets, which can be technically challenging.

Right to object to profiling. Individuals can object to processing of their personal data for profiling purposes. AI systems that create behavioral or preference profiles may trigger this right.

Privacy by design for AI systems

Article 25 of GDPR requires data protection by design and by default, commonly called privacy by design: data protection must be built into systems from the start, not added afterward. For AI systems, this requires deliberate design choices.

Data minimization. Design AI systems to use the minimum personal data necessary for their function. Do not collect or process personal data simply because it might be useful. Design for the specific purpose.
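
In practice, data minimization often takes the form of an explicit allowlist applied before data reaches a training or inference pipeline. This is a minimal sketch; the field names and the allowlist itself are hypothetical and would depend on the system's documented purpose.

```python
# Hypothetical allowlist for a support-ticket classifier: only these fields
# are needed for the stated purpose, so everything else is dropped.
ALLOWED_FIELDS = {"ticket_text", "product", "language"}


def minimize(record, allowed=ALLOWED_FIELDS):
    """Return a copy of the record containing only allowlisted fields."""
    return {key: value for key, value in record.items() if key in allowed}


raw = {
    "ticket_text": "Cannot log in",
    "product": "mobile app",
    "language": "en",
    "email": "person@example.com",
    "date_of_birth": "1990-01-01",
}

print(minimize(raw))
# {'ticket_text': 'Cannot log in', 'product': 'mobile app', 'language': 'en'}
```

An allowlist fails closed: a new field added upstream is excluded until someone decides it is needed, which is the behavior the principle calls for.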

Purpose limitation. AI models trained for one purpose should not be used for other purposes without reassessing the lawful basis and data subjects’ reasonable expectations.

Access controls. Personal data used in AI training and processing should be accessible only to those with a legitimate need. Training datasets containing personal data should not be freely accessible across the organization.

Retention schedules. GDPR does not prescribe fixed retention periods, so define and document your own, justified by purpose, and apply them to data used in AI systems, including training data and processing logs.
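
A retention schedule can be enforced with a simple expiry check run against each data category. The sketch below is illustrative only: the categories and the periods (730 and 90 days) are made-up assumptions, not values taken from GDPR, and each organization has to justify its own.

```python
from datetime import date, timedelta

# Hypothetical retention periods per data category
RETENTION = {
    "training_data": timedelta(days=730),
    "processing_logs": timedelta(days=90),
}


def is_expired(category, collected_on, today):
    """True when a record has been held longer than its category's retention period."""
    return today - collected_on > RETENTION[category]


today = date(2024, 6, 1)
print(is_expired("processing_logs", date(2024, 1, 1), today))  # True
print(is_expired("training_data", date(2024, 1, 1), today))    # False
```

Passing the current date in as an argument, rather than reading the clock inside the function, keeps the check easy to test and audit.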

GDPR compliance checklist for AI

Use this checklist to assess your current AI programs against GDPR requirements.

Lawful basis:

Every AI system processing personal data has a documented lawful basis under Article 6
Special category data (health, race, religion, etc.) also meets one of the additional conditions in Article 9
Legitimate interests assessments have been conducted where legitimate interests is the basis
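
The three lawful basis checks above can be run against a register of AI systems. This is a minimal sketch assuming each register entry is a dictionary; the keys and the gap messages are hypothetical and not drawn from any standard record format.

```python
def lawful_basis_gaps(entry):
    """Return the lawful basis checklist items that a register entry fails."""
    gaps = []
    if not entry.get("article_6_basis"):
        gaps.append("missing Article 6 lawful basis")
    if entry.get("special_category") and not entry.get("article_9_condition"):
        gaps.append("missing Article 9 condition")
    if entry.get("article_6_basis") == "legitimate_interests" and not entry.get("lia_completed"):
        gaps.append("missing legitimate interests assessment")
    return gaps


churn_model = {
    "system": "churn model",
    "article_6_basis": "legitimate_interests",
    "special_category": False,
    "lia_completed": False,
}

print(lawful_basis_gaps(churn_model))  # ['missing legitimate interests assessment']
```

An empty list means the entry passes these three checks; it does not mean the chosen basis is actually valid, which remains a legal judgment.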

Training data:

Training datasets have been reviewed for personal data
Retention schedules apply to training data
Data provenance is documented

Automated decision-making:

AI systems that make significant individual decisions have been assessed against Article 22
Where Article 22 applies, meaningful human oversight is in place (not rubber-stamping)
Disclosure processes inform affected individuals of their rights

Data subject rights:

Access request processes can handle AI-related requests
Erasure processes address AI training data (with documented limitations for model weights)
Objection to profiling processes are in place

Privacy by design:

New AI systems undergo a data protection impact assessment (DPIA) when they involve high-risk processing
Data minimization is a design principle for AI systems, not an afterthought

For organizations handling sensitive data in AI systems, a private AI workspace provides additional controls over data access and retention.

Frequently asked questions

Do we need to conduct a DPIA for every AI system?

A DPIA is required under Article 35 when processing is “likely to result in a high risk” to individuals’ rights and freedoms. AI systems that profile individuals, process special category data, make automated decisions with significant effects, or process personal data at large scale will usually meet that threshold. If in doubt, conducting a DPIA is lower risk than not conducting one.
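
The triggers named in this answer can be turned into a simple screening step for new AI systems. The sketch below is an assumption-laden illustration: the flag names are hypothetical, and a result with no triggers is a prompt for judgment, not proof that no DPIA is needed.

```python
# Trigger flags mirroring the four factors named in the answer above
DPIA_TRIGGERS = (
    "profiling",
    "special_category_data",
    "significant_automated_decisions",
    "large_scale_processing",
)


def dpia_triggers(system):
    """Return the trigger flags that are set for this system description."""
    return [trigger for trigger in DPIA_TRIGGERS if system.get(trigger, False)]


def dpia_recommended(system):
    """Recommend a DPIA as soon as any single trigger is present."""
    return bool(dpia_triggers(system))


chatbot = {"name": "internal FAQ bot"}
screening = {"name": "CV screening", "profiling": True, "significant_automated_decisions": True}

print(dpia_recommended(chatbot))   # False
print(dpia_triggers(screening))    # ['profiling', 'significant_automated_decisions']
```

Returning the list of triggers, not just a yes or no, gives the DPIA author a starting point for describing the risks.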

Can we use publicly available data to train AI without GDPR issues?

Publicly available data is not exempt from GDPR. If publicly available data includes personal data (which most web-scraped data does), processing it for AI training requires a lawful basis. Legitimate interests is the most commonly relied-upon basis, but it requires a documented assessment and is not a blanket exemption.

What does “meaningful human review” mean under Article 22?

Meaningful human review means that a human actually evaluates the AI’s recommendation using their own judgment, with access to relevant information, and has both the authority and the capacity to override the AI. A human who lacks the information or expertise to evaluate the AI’s recommendation, or who is not given adequate time to do so, does not constitute meaningful review.

Is your AI program compliant with GDPR?

Many organizations discover GDPR-AI compliance gaps only when they receive a subject access request they cannot fully fulfill or when an enforcement action prompts a review. Proactive compliance is significantly less costly than fixing those gaps under regulatory pressure.

Path one: review your AI data practices. Use the AI audit to map your AI systems’ data processing activities against GDPR requirements and identify compliance gaps.

Path two: work with Phos AI Labs. If you want expert help building GDPR-compliant AI programs, including private AI workspace options that minimize personal data exposure, Phos AI Labs is a CCA-F certified Claude implementation partner. Thirty minutes, no deck. Start here.

