What is the threshold for a large training run?

Gist 1

‘large training runs’ means the production process of a powerful AI model that require computing resources above a very high threshold; (Article 3)

This excerpt from Article 3 defines ‘large training runs’ as the production process of an AI model that requires an amount of computing resources above a certain ‘very high’ threshold. However, it does not explicitly state what this ‘very high’ threshold is in specific terms.

issue and periodically update guidelines on the thresholds that qualify training a foundation model as a large training run, record and monitor known instances of large training runs, and issue an annual report on the state of play in the development, proliferation, and use of foundation models alongside policy options to address risks and opportunities specific to foundation models. (Article 56b)

Under this part of Article 56b, the task of issuing and updating the guidelines on what qualifies the training of a foundation model as a ‘large training run’ falls to the AI Office. The Office is also tasked with recording and monitoring known instances of such training runs and with reporting annually on the development, proliferation, and use of foundation models, together with policy options for the associated risks and opportunities.

The other provisions reviewed, Recital 6b and Annex III, do not give any further insight into the exact threshold that qualifies the training of an AI model as a ‘large training run’. For specific numeric thresholds or benchmarks, refer to the current guidelines issued by the AI Office.

AI systems can be used as stand-alone software system, integrated into a physical product (embedded), used to serve the functionality of a physical product without being integrated therein (non-embedded) or used as an AI component of a larger system. If this larger system would not function without the AI component in question, then the entire larger system should be considered as one single AI system under this Regulation. (Recital 6b)

While Recital 6b does not directly address the question, it does provide context on what is considered an AI system under the regulation. Specifically, if an AI component is integral to the functioning of a larger system, that entire system should be considered as a single AI system under the Regulation. This could have implications for how the regulation applies depending on the size and complexity of the system in question.

Annex III likewise provides no further insight into the question: it focuses on defining which AI systems are considered high-risk under the Regulation and does not mention thresholds for training runs.

Gist 2

‘large training runs’ means the production process of a powerful AI model that requires computing resources above a very high threshold; (Article 3)

The EU AI Act defines ‘large training runs’ as the production process of a powerful AI model that requires computing resources above a very high threshold. However, the Act gives neither a specific amount nor a unit of measurement for this threshold, so the term remains open-ended and its meaning may evolve.

Issue and periodically update guidelines on the thresholds that qualify training a foundation model as a large training run, record and monitor known instances of large training runs, and issue an annual report on the state of play in the development, proliferation, and use of foundation models alongside policy options to address risks and opportunities specific to foundation models. (Article 56b)

Article 56b tasks the AI Office with issuing and periodically updating guidelines on the thresholds that qualify training a foundation model as a ‘large training run’. The Article implies ongoing oversight, documentation, and reporting on large training runs. The exact threshold is therefore a dynamic factor, subject to regular revision by the AI Office.

AI systems can be used as stand-alone software system, integrated into a physical product (embedded), used to serve the functionality of a physical product without being integrated therein (non-embedded) or used as an AI component of a larger system. If this larger system would not function without the AI component in question, then the entire larger system should be considered as one single AI system under this Regulation. (Recital 6b)

Recital 6b provides context about how AI systems can be used, but it does not directly address the thresholds or criteria that qualify something as a ‘large training run’.

The EU AI Act introduces the term ‘large training run’, but the specifics of the threshold are left to the AI Office’s guidelines rather than fixed in the Act itself. This suggests that the threshold may be a moving target, adjusted as AI technology and the computing resources available for it advance.

Gist 3

‘large training runs’ means the production process of a powerful AI model that require computing resources above a very high threshold; (Article 3)

The definition of ‘large training runs’ is provided in Article 3. It refers to the process of creating a powerful AI model that necessitates considerable computational resources. However, this Article does not specify an exact threshold that would be considered ‘very high’.

issue and periodically update guidelines on the thresholds that qualify training a foundation model as a large training run, record and monitor known instances of large training runs, and issue an annual report on the state of play in the development, proliferation, and use of foundation models alongside policy options to address risks and opportunities specific to foundation models. (Article 56b)

Article 56b places the onus on the AI Office to periodically release guidelines that define the thresholds for large training runs. Therefore, while the Article doesn’t provide an explicit quantitative threshold, it implies that this value may be subject to change over time as the AI Office provides updates.

Neither Recital 6b nor Annex III contributes directly to defining what qualifies as a ‘large training run.’ The specific threshold is therefore expected to be set through the guidelines that the AI Office issues and periodically updates under Article 56b. The same office will also record and monitor known instances of large training runs, which should help keep the threshold up to date.

Gist 4

Following an analysis of the relevant sections of the EU AI Act regarding the definition and threshold of “large training runs,” here is the conclusion:

‘large training runs’ means the production process of a powerful AI model that requires computing resources above a very high threshold; (Article 3)

This definition given in Article 3 specifies that “large training runs” pertain to the process of creating potent AI models necessitating significant computing resources. However, this provision does not detail the exact criteria for determining this “very high threshold”.

r) issue and periodically update guidelines on the thresholds that qualify training a foundation model as a large training run, record and monitor known instances of large training runs, and issue an annual report on the state of play in the development, proliferation, and use of foundation models alongside policy options to address risks and opportunities specific to foundation models. (Article 56b)

Article 56b sets out the relevant responsibility of the AI Office: it must issue and periodically update guidelines on the thresholds that qualify the training of a foundation model as a large training run. This implies that whether a given training run counts as ‘large’ can change as the guidelines are revised. The Article itself does not state the thresholds, leaving them to be set and adjusted by the AI Office.

Recital 6b and Annex III do not contain information relevant to thresholds for large training runs.

On the whole, the provisions of the EU AI Act reviewed here do not define the threshold that qualifies an AI training run as “large”. Instead, the Act assigns this responsibility to the AI Office, which is expected to issue and periodically update guidelines on the thresholds and to monitor known large training runs. The exact quantitative threshold therefore depends on those guidelines, so it is advisable to check the AI Office’s communications regularly for updates.