The regulation of foundation models in the EU AI Act

Friday 12 April 2024

Innocenzo Genna
Avvocato, Rome-Brussels
inno@genna.eu

Introduction

Among the various aspects of the EU artificial intelligence (AI) regulation (the ‘AI Act’), that of foundation models was one of the most controversial and contentious of the entire interinstitutional negotiation, capable right up until the end of derailing the agreement in the trilogue and spreading uncertainty even up to the date of the definitive approval by the Committee of the Permanent Representatives of the Governments of the Member States to the European Union (Coreper) on 2 February 2024.

The debate was sparked in the trilogue because the original proposal of the European Commission (EC) – dated April 2021 – did not precisely address the issue. Foundation models are software such as GPT-4 (from OpenAI) and LaMDA (Language Model for Dialogue Applications, from Google) that have become well known since 2022 – when the AI Act negotiations had already started – thanks to the emergence of very popular generative AI applications such as ChatGPT and Bard. The popularity of such applications and their potential impact upon society and democracy attracted the attention of the EU legislators and resulted in an agitated debate on whether foundation models, which constituted the computing background of such applications, should be regulated or not. While some authors considered that regulation should apply only to products placed on the market (ie, AI systems, applications and so on), others believed that the entire production chain should be regulated, including the foundation models, which can be considered, by analogy with the telecom sector, a kind of wholesale component.

Thus, how to regulate foundation models became very controversial in the AI Act legislative process and created a major disagreement between the co-legislators, with France and Germany in particular preferring a much lighter regulatory framework than the one supported by the Parliament and the Commission.

Although a political agreement on the entire text had been reached during the trilogue of 6 December 2023, the solution found for foundation models left some countries dissatisfied. Since the voting system in the Council provides that four Member States may be enough to block a decision, some countries started to discuss a possible, although clamorous, return to the trilogue procedure. However, after weighing the pros and cons of reopening the legislative process, the Coreper finally approved the text and, following ratification by the European Parliament on 13 March 2024, the AI Act was successfully completed.

It is now therefore possible to make a final assessment of how the European legislator intended to regulate so-called ‘foundation models’. The following analysis is based on the latest official text of the AI Act circulated to the public, as the official publication in the Official Journal of the EU had not yet occurred at the time of writing.

New name and definition

Foundation models are now defined by the AI Act (Article 3, 63) as ‘General Purpose AI Models’ (GPAI Models). In summary, they can be defined as computer models which, through training on a vast amount of data, can be used for a variety of tasks, individually or inserted as components into an AI system. The AI Act’s legal definition, however, is more complex and will certainly not fail to raise objections (if not even jurisdictional appeals before the EU Court of Justice), as it can be interpreted in such a way as to potentially include a wide variety of technological cases:

‘GPAI model means an AI model, including when trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable to competently perform a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications. This does not cover AI models that are used before release on the market for research, development and prototyping activities.’

Needless to say, legal definitions cannot easily represent technology in an industrial sector in rapid development. This is the main reason why the EU legislator decided upon a very wide definition. Choosing a narrower definition would have been problematic and open to challenge, because it would have excluded relevant business cases from the ambit of application of the AI Act. Over-reaching has been preferred to a limited legislative scope. The coming years will show whether this has been a wise choice or whether, instead, we are heading into a season of permanent litigation.

Categories of GPAI models

The AI Act identifies two categories of GPAI Models: generic GPAI models and ‘systemic’ GPAI models (Article 51). The latter are models which, by virtue of the ‘systemic risks’ they can cause at a European level, are subject to more pervasive regulation than the generic ones. It should be noted that France and Germany, with a non-paper presented in November 2023, had proposed a single category of foundation model, to be subject to self-regulation by the operators themselves. The Parliament, somehow supported by the Commission, proposed binding obligations instead. The Commission responded with a ‘scalar’ regulatory system, based on a distinction between generic and systemic operators, probably borrowed from the Digital Services Act (DSA), a regime in which larger platforms (so-called ‘very large online platforms’ or VLOPs) are in fact subject to stricter rules than generic platforms, due to the systemic risks associated with larger operators.

However, while in the case of the DSA setting the parameters (turnover, number of users) for the identification of VLOPs was not too problematic, because the online platform market is sufficiently known and consolidated, the same operation appeared critical in the AI Act, which addresses a sector where even the main technological players and their impact on the market are still a fairly new phenomenon.

Therefore, it was not easy to define the conditions under which a foundation model may be considered so ‘large’ as to imply excessive risks compared with the rest of the players. In the end, the AI Act entrusts the designation of systemic GPAI models to a procedure managed by the EC alone, which acts on the basis of quite vague criteria indicated in the regulation, criteria which the EC itself can adapt over time.

Indeed, what is a systemic risk at European level? The AI Act explains it with a somewhat tautological definition (Article 3, 65):

‘“Systemic risk at Union level” means a risk that is specific to the high-impact capabilities of general-purpose AI models, having a significant impact on the internal market due to its reach, and with actual or reasonably foreseeable negative effects on public health, safety, public security, fundamental rights, or the society as a whole, that can be propagated at scale across the value chain.’

That this is a tautology can also be seen from the concept of ‘high impact capabilities’ (Article 3, 64), ie, the most significant characteristic for evaluating whether a GPAI is systemic or not:

‘[…] “high-impact capabilities” in general purpose AI models means capabilities that match or exceed the capabilities recorded in the most advanced general purpose AI models […]’  

In other words, we are in a field in which the EC, which has the task of identifying, through the European AI Office, the systemic GPAI models, will have highly discretionary and therefore preponderant power. It will therefore be able to conduct a real industrial policy, simply by deciding which GPAI models can be designated as systemic and which cannot.

Furthermore, the AI Act indicates a quantitative criterion, based on the computational capacity of the model, to identify systemic risks (Article 51, 2):

‘[…] when the cumulative amount of compute used for its training measured in floating point operations (FLOPs) is greater than 10^25.’

However, this is a rebuttable presumption, which can be overridden in either direction at the discretion of the EC. Moreover, there is a widespread belief that in the future the power of GPAI models will not necessarily depend only on computing power.
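To give a sense of the order of magnitude involved, the following Python sketch applies the rough rule of thumb that training a dense transformer consumes approximately 6 × parameters × training tokens floating point operations, and compares the result with the 10^25 FLOP presumption of Article 51, 2. The estimation formula and the model figures are purely illustrative assumptions made for this article; the AI Act does not prescribe any particular method for measuring cumulative training compute.

# Illustrative sketch only: the 6 * parameters * tokens figure is a common
# rule of thumb for dense transformer training; the model sizes are hypothetical.

SYSTEMIC_PRESUMPTION_FLOPS = 1e25  # Article 51, 2 threshold (a rebuttable presumption)

def estimated_training_flops(n_parameters: float, n_training_tokens: float) -> float:
    """Rough estimate of cumulative training compute in FLOPs."""
    return 6 * n_parameters * n_training_tokens

def presumed_systemic(n_parameters: float, n_training_tokens: float) -> bool:
    """True if the estimate exceeds the threshold triggering the presumption."""
    return estimated_training_flops(n_parameters, n_training_tokens) > SYSTEMIC_PRESUMPTION_FLOPS

# Hypothetical example: a 500-billion-parameter model trained on 10 trillion tokens.
flops = estimated_training_flops(5e11, 1e13)
print(f"Estimated training compute: {flops:.2e} FLOPs")    # 3.00e+25
print("Presumed systemic:", presumed_systemic(5e11, 1e13))  # True

On these (hypothetical) assumptions the estimate exceeds 10^25 FLOPs, so the presumption of systemic risk would be triggered; as noted above, the EC could still decide otherwise, and a smaller model could conversely be designated as systemic on the basis of other criteria.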

Regulation of basic GPAI models

During the legislative process, the Council and the Parliament showed different approaches as to how to regulate foundation models: while the Council preferred to start with a lighter framework (although allowing stricter rules following an analysis by the EC), the Parliament advocated stricter rules from the outset. The EU legislators also used different legal definitions, a situation which contributed to uncertainty in the debate since it made it difficult to measure the real distance between the two regulatory approaches.

The final text of the AI Act is now clear and precise. Generic GPAI models are subject (Article 53) to mere transparency obligations, consisting of guaranteeing the availability of technical documentation that makes their functioning understandable (also in relation to the data training process) to the European AI Office as well as to third parties who intend to integrate the model into their AI systems. This is a reasonable regulation which, in itself, should not constitute a significant obstacle to the development of the models. A provider wanting to place a foundation model on the EU market must appoint a representative there (Article 54).

There is also an obligation to put in place a policy aimed at respecting copyright legislation (Article 53, 1, c). A final decision has therefore not been made as to whether the use of copyrighted data could lead to an obligation to remunerate rights holders, as some of them have been clamouring for. This decision will have to be taken in the future as part of a possible reflection on, or revision of, EU copyright law. At the moment, however, it can be said that the obligation for GPAI models to establish a policy for this purpose indicates that the topic is considered relevant.

Note that lighter regulation applies (Article 53, 2) to GPAI models released under a free and open-source licence, unless they are ‘systemic’. Where obligations do apply to open-source cases, the problem arises of identifying the obliged party, given that we are sometimes faced with a community rather than a specific provider.

Regulation of systemic GPAI models

Systemic GPAI models are subject to the same obligations as basic GPAI models, plus additional ones which give rise, overall, to more pervasive regulation (Article 55). In fact, they must: (a) carry out the evaluation of the model in accordance with standardised protocols and tools that reflect the state of the art, including the conduct and documentation of ‘adversarial tests’ in order to identify and mitigate systemic risk; (b) assess and mitigate possible systemic risks at EU level, including their sources, that could arise from the development, placing on the market or use of general-purpose AI models with systemic risk; (c) track, document and report without undue delay to the European AI Office and, where appropriate, to the competent national authorities, relevant information on serious incidents and possible corrective measures to address them; and (d) ensure an adequate level of cybersecurity protection relating to the model and its physical infrastructure.

The question is whether such regulation is so pervasive as to become the obstacle to the development of European foundation models that some governments, in particular France, had feared. Of course, how the EC applies these obligations will matter. For example, in the case of ‘adversarial testing’, the European AI Office’s discretion in deeming the testing process sufficient will be relevant. As regards serious incidents, it will be necessary to understand whether the GPAI model provider will be able to report all the information concerning the AI systems based on the model (since these are normally run by different companies).

Codes of practice

The AI Act provides that, pending the publication of harmonised European standards, both categories of GPAI models – generic and systemic – can rely on ‘codes of practice’ to demonstrate compliance with their obligations (Article 56). By ‘codes of practice’ we mean technical documents that report the standards of a technological sector.

Compliance with the codes creates a mere presumption of compliance with the obligations of the AI Act.

The development of the codes by companies is encouraged and supervised by the AI Office, which also relies on the collaboration of the European AI Board (EAIB, the body in which the representatives of the Member States sit). National authorities can be involved, and the ‘support’ of stakeholders and experts is also expected.

A formal agreement from the European AI Office on the text of a code is not necessary, although the system suggests that, in the event of an objection from that office, the value of the code would effectively be diminished, since it provides a mere presumption of conformity with the obligations of the regulation. Therefore, for a code to have the effects desired by the industry, it is in practice necessary for it to be supported by the European AI Office. Whether the Member States can share this power with the European AI Office depends on the EAIB and its ability to collaborate with that office.

However, formal approval by the European AI Office is possible in order to make a code valid throughout the EU (Article 56, 6). In this case, an implementing act by the EC is envisaged, to be adopted with the approval of the Member States (the AI Act refers to Article 5 of EU Regulation No 182/2011 on comitology procedures).

Violation of the codes of practice is equivalent to violation of the obligations of the AI Act and, as such, involves the imposition of sanctions (Article 99) according to a mixed system (sanctions partly defined by the regulation itself, partly delegated to the Member States).

Vertically-integrated GPAI models

Remarkably, the AI Act specifically regulates cases of vertical integration, ie, when the GPAI provider and the deployer of the relevant AI system are the same entity. In this case, the European AI Office operates as a market surveillance authority (Article 75, 1). This is the first evident case of applying specific competition rules to the AI sector, where numerous voices have already warned about the risks of concentration. It is worth mentioning that the European Commission has recently started to investigate the relations between OpenAI and Microsoft.

Conclusions

In conclusion, there is no doubt that the EC has taken a leading role in the treatment of foundation models in the AI Act, having been granted, directly or through the European AI Office, formidable powers not only in regulation and enforcement, but also in the actual industrial policy of the sector. In particular, the EC enjoys exclusive competence, and broad discretion, in the ‘executive’ phase of the system, ie, in the identification of systemic GPAI models and the subsequent application of the regulation.

Greater control by the Member States seems instead to apply to the adaptation of both the obligations incumbent on models in general and the parameters for the designation of systemic models. The AI Act in fact refers to acts, both implementing and delegated, which the EC can adopt on the basis of procedures that entail a degree of involvement of the Parliament and the Council.