Article 55 of Regulation (EU) 2024/1689 — Evaluation and adversarial testing of general-purpose AI models with systemic risk. Official text, practical interpretation, key obligations and compliance implications.

Official Text Summary

Article 55 of Regulation (EU) 2024/1689 establishes specific evaluation and adversarial testing obligations for providers of general-purpose AI (GPAI) models that present systemic risk. Building on the broader set of obligations set out in Article 53 and the systemic-risk classification criteria in Article 51, Article 55 requires such providers to perform model evaluations in accordance with standardised protocols, and to conduct adversarial testing — commonly referred to as red-teaming — on a regular basis.

The evaluations required under paragraph 1(a) must follow standardised protocols and tools that reflect the state of the art, including those developed or endorsed by the AI Office. Where no standardised protocols exist, providers must design and apply appropriate methodologies to identify and assess the nature and extent of systemic risks.

Paragraph 1(b) mandates adversarial testing, to be carried out either internally or by engaging accredited external experts, with the objective of identifying risks not captured by standard evaluation procedures. Providers must document the methodology, scope, and outcomes of both evaluations and adversarial testing exercises, and report significant findings to the AI Office. The AI Office itself retains authority under paragraph 2 to organise or commission independent adversarial testing at any time. The article also obliges providers to share evaluation results and testing reports with competent authorities when requested.

What This Means in Practice

For organisations that develop or deploy frontier GPAI models, Article 55 imposes a structured and documented quality-assurance process focused specifically on identifying systemic harms. In practice, this means that before releasing a qualifying model — and on a continuing basis after release — providers must run both standardised capability evaluations and targeted adversarial exercises designed to probe for catastrophic or widespread risks such as mass-scale manipulation, generation of weapons-related content, large-scale cyberattacks, or critical infrastructure disruption.

From an operational standpoint, compliance requires assembling or contracting multidisciplinary red-team capacity with expertise spanning AI safety, cybersecurity, disinformation, biosecurity, and other relevant domains. Evaluations must be conducted against benchmarks and protocols that reflect the current state of the art; providers cannot rely on proprietary, unpublished methodologies alone if standardised alternatives exist.

Documentation is central. Providers must maintain detailed records of each evaluation cycle — including scope, team composition, scenarios tested, outputs observed, and mitigations applied — and must be able to produce these records for the AI Office upon request. Where testing reveals new or aggravated systemic risks, providers are obliged to implement corrective measures and, where the risk is serious, to notify the AI Office without undue delay.
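One way to keep such records consistent is a small structured type per evaluation cycle. The sketch below is illustrative only: the class name, the field names and the choice of which fields count as required are assumptions made for this example, not a format prescribed by the Regulation or the AI Office.

```python
from dataclasses import dataclass, field, asdict
from datetime import date
import json


@dataclass
class EvaluationRecord:
    """One evaluation or red-team cycle (all field names are hypothetical)."""
    model_id: str
    cycle_date: date
    scope: str
    team: list[str]
    scenarios: list[str]
    outputs_observed: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)
    serious_risk_found: bool = False

    def missing_fields(self) -> list[str]:
        """Return the names of required fields that were left empty."""
        required = {"scope": self.scope, "team": self.team, "scenarios": self.scenarios}
        return [name for name, value in required.items() if not value]

    def to_json(self) -> str:
        """Serialise the record so it can be produced on request."""
        data = asdict(self)
        data["cycle_date"] = self.cycle_date.isoformat()  # dates are not JSON-native
        return json.dumps(data, indent=2)


record = EvaluationRecord(
    model_id="example-model-v1",  # placeholder identifier
    cycle_date=date(2025, 9, 1),
    scope="pre-release red-team exercise",
    team=["ai-safety", "cybersecurity"],
    scenarios=["automated cyberattack facilitation"],
)
print(record.missing_fields())
print(record.to_json())
```

A check such as `missing_fields()` can run before a cycle is closed, so that incomplete records are caught internally rather than when a request arrives.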

For example, a provider releasing a large multimodal model exceeding the 10^25 FLOP training threshold should schedule red-team exercises before launch covering at minimum: dual-use scientific knowledge elicitation, persuasive content generation at scale, and automated cyberattack facilitation. Post-launch, these exercises must recur whenever the model undergoes significant fine-tuning or capability updates.
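The scheduling logic in this example can be sketched as a simple coverage check. The three areas come from the example above; the class, the function and the rule that a significant change triggers a re-run of every area are assumptions made for illustration, since the text only says that the exercises must recur.

```python
from dataclasses import dataclass, field

# Minimum pre-launch coverage areas taken from the example in the text.
REQUIRED_AREAS = frozenset({
    "dual-use scientific knowledge elicitation",
    "persuasive content generation at scale",
    "automated cyberattack facilitation",
})


@dataclass
class RedTeamStatus:
    """Hypothetical tracking state for one qualifying model."""
    areas_tested: set[str] = field(default_factory=set)
    # Set to True after significant fine-tuning or a capability update.
    changed_since_last_exercise: bool = False


def areas_due(status: RedTeamStatus) -> set[str]:
    """Return the coverage areas that still need a red-team exercise."""
    if status.changed_since_last_exercise:
        # Assumption: a significant change means every area is re-tested.
        return set(REQUIRED_AREAS)
    return set(REQUIRED_AREAS - status.areas_tested)


pre_launch = RedTeamStatus(areas_tested={"persuasive content generation at scale"})
print(sorted(areas_due(pre_launch)))

post_update = RedTeamStatus(
    areas_tested=set(REQUIRED_AREAS),
    changed_since_last_exercise=True,
)
print(sorted(areas_due(post_update)))
```

In the first case the two untested areas are reported as due; in the second, all three are due again because the model has changed since the last exercise.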

Key Obligations

Relationship to Other Articles

Article 55 operates as the operational counterpart to the systemic risk classification established in Article 51 and the general GPAI obligations set out in Article 53. It should be read alongside Article 52, which sets out the procedure for classifying a model as presenting systemic risk, including the provider's duty to notify the Commission, and Article 54, which addresses authorised representatives for GPAI providers established outside the Union. The incident reporting duty in Article 73 intersects with Article 55 where adversarial testing uncovers a serious incident or near-miss that requires notification. At the supervisory level, the AI Office's authority to commission testing under Article 55(2) is grounded in the broader oversight powers conferred by Articles 88 and 89. Providers should also consult Recital 110, which clarifies the rationale for distinguishing systemic-risk models and the importance of pre-market safety evaluation as a complement to ongoing monitoring.

Compliance Timeline

Article 55 has applied since 2 August 2025, when the Regulation's GPAI provisions became applicable, and is therefore already in force. Providers of GPAI models with systemic risk that have not yet established evaluation and adversarial testing programmes risk being in breach of current obligations and should treat remediation as an immediate priority.

Official AI Act Compliance Deadline Calendar

Sources: Regulation (EU) 2024/1689 and the 2026 Digital Omnibus on AI.

Obligation | Applies to | Status | Legal basis
Prohibited Practices (Art. 5) | All providers and deployers | active | AI Act Art. 5
GPAI Rules (Chapter 5) | GPAI model providers | active | AI Act Art. 51-56
High-risk AI: Annex III (standalone) | Providers of standalone Annex III systems | deferred | AI Omnibus 2026 Art. 6(2)
High-risk AI: Annex I (embedded) | AI embedded in Annex I regulated products | deferred | AI Omnibus 2026 Art. 6(1)
AI-Generated Content Marking | Providers of generative GPAI systems | active | AI Act Art. 50(2)
Regulatory Sandboxes | National competent authorities | active | AI Act Art. 57


Frequently Asked Questions

Adversarial testing, also known as red-teaming, refers to structured assessments in which experts attempt to elicit harmful, biased, or otherwise undesirable outputs from a general-purpose AI model. Article 55 requires providers of GPAI models with systemic risk to conduct such testing prior to placing the model on the market and on an ongoing basis thereafter, to identify and mitigate serious risks before they cause harm.

Article 55 applies exclusively to providers of general-purpose AI (GPAI) models that have been determined to present systemic risk — a designation triggered, under Article 51, when a model is trained using a total compute of more than 10^25 FLOPs, or when the European Commission concludes through other means that the model presents systemic risk. Providers of GPAI models below this threshold are not subject to Article 55.
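The two triggers described here reduce to a simple either/or test. The following is a minimal sketch of that test, not a legal determination; the function and parameter names are hypothetical, and the compute figure is the one cited in the text.

```python
FLOP_THRESHOLD = 1e25  # training-compute figure cited in the text


def presumed_in_scope(training_flops: float, commission_designated: bool = False) -> bool:
    """Return True if either trigger described above applies."""
    return training_flops > FLOP_THRESHOLD or commission_designated


print(presumed_in_scope(3e25))                              # True: above the compute threshold
print(presumed_in_scope(5e24))                              # False: below it, no designation
print(presumed_in_scope(5e24, commission_designated=True))  # True: designated by the Commission
```

The comparison is strict because the text refers to compute of "more than" 10^25 FLOPs, so a model at exactly the threshold is not caught by the compute trigger alone.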

Article 55 allows providers to conduct adversarial testing using internal resources or by engaging qualified external third parties. Notably, the article empowers the AI Office to organise and coordinate independent adversarial testing of GPAI models with systemic risk, including by commissioning such testing from trusted bodies. Results and methodologies must be documented and made available to the AI Office upon request.

The provisions governing general-purpose AI models, including Article 55, became applicable on 2 August 2025, twelve months after the Regulation entered into force on 1 August 2024. Providers who placed a qualifying GPAI model on the market before that date have until 2 August 2027 to achieve compliance, under the transitional rule in Article 111(3).

Non-compliance with the obligations for GPAI models with systemic risk — including the adversarial testing requirement in Article 55 — can attract administrative fines of up to 3% of global annual turnover, or EUR 15 million, whichever is higher. The AI Office, which has primary supervisory authority over GPAI providers, may also issue corrective measures, request additional documentation, or suspend market access in serious cases.
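Because the ceiling is the higher of two amounts, the fixed EUR 15 million figure governs for smaller providers and the 3% figure takes over above EUR 500 million in turnover. The sketch below works through that arithmetic; the function name is hypothetical, and the result is the upper bound of a possible fine, not a prediction of the amount that would be imposed.

```python
from decimal import Decimal

TURNOVER_RATE = Decimal("0.03")      # 3% of global annual turnover
FIXED_AMOUNT = Decimal("15000000")   # EUR 15 million


def max_fine_eur(global_annual_turnover_eur: Decimal) -> Decimal:
    """Return the fine ceiling: the higher of 3% of turnover and EUR 15 million."""
    return max(TURNOVER_RATE * global_annual_turnover_eur, FIXED_AMOUNT)


# EUR 100 million turnover: 3% is EUR 3 million, so the fixed amount applies.
print(max_fine_eur(Decimal("100000000")))

# EUR 2 billion turnover: 3% is EUR 60 million, which exceeds the fixed amount.
print(max_fine_eur(Decimal("2000000000")))
```

`Decimal` is used instead of floating point so that monetary amounts are not subject to binary rounding.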
