Executive Order on the Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence | The White House

Basically: "any model trained with ~28M H100 hours, which is around $50M USD, or any cluster with 10^20 FLOP/s, which is around 50,000 H100s, which only two companies currently have". Hat-tip to nearcyan on Twitter for this calculation.

Specific language below.

"(i) any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and

(ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI."
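
The back-of-the-envelope figures at the top of the post can be roughly reproduced from the two thresholds. This is a minimal sketch, not nearcyan's actual calculation: the per-GPU throughput values are assumptions (approximate H100 peak numbers, about 1e15 operations per second for dense BF16 and about 2e15 for dense FP8), and the function names are made up for illustration.

```python
# Thresholds quoted from the order.
MODEL_THRESHOLD_OPS = 1e26           # total operations used in training
CLUSTER_THRESHOLD_OPS_PER_S = 1e20   # theoretical peak operations per second

# Assumptions (not from the post): approximate peak throughput of one H100.
H100_PEAK_BF16 = 1e15   # ~1 PFLOP/s, dense BF16
H100_PEAK_FP8 = 2e15    # ~2 PFLOP/s, dense FP8


def gpu_hours_for(total_ops, ops_per_s_per_gpu, utilization=1.0):
    """GPU-hours needed to perform total_ops at the given per-GPU rate."""
    seconds = total_ops / (ops_per_s_per_gpu * utilization)
    return seconds / 3600


def gpus_for(cluster_ops_per_s, ops_per_s_per_gpu):
    """Number of GPUs whose combined peak rate equals cluster_ops_per_s."""
    return cluster_ops_per_s / ops_per_s_per_gpu


hours = gpu_hours_for(MODEL_THRESHOLD_OPS, H100_PEAK_BF16)
gpus = gpus_for(CLUSTER_THRESHOLD_OPS_PER_S, H100_PEAK_FP8)

print(f"model threshold:   ~{hours / 1e6:.1f}M H100-hours at peak")
print(f"cluster threshold: ~{gpus:,.0f} H100s at peak")
```

Under these assumptions the two quoted figures seem to rest on different per-GPU rates: ~28M hours matches about 1e15 operations per second per GPU, while 50,000 GPUs matches about 2e15. Real training runs sit well below peak, so passing a `utilization` below 1.0 raises the GPU-hour estimate accordingly.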

  • ambient_temp_xenoB
    11 months ago

    I don’t know how big 10^20 floating-point operations per second is, or whether 70b was trained on something bigger or smaller. But I think that figure is the more important one, since I think Meta uses a single datacentre.

    These figures in context:

    (b) The Secretary of Commerce, in consultation with the Secretary of State, the Secretary of Defense, the Secretary of Energy, and the Director of National Intelligence, shall define, and thereafter update as needed on a regular basis, the set of technical conditions for models and computing clusters that would be subject to the reporting requirements of subsection 4.2(a) of this section. Until such technical conditions are defined, the Secretary shall require compliance with these reporting requirements for:

    (i) any model that was trained using a quantity of computing power greater than 10^26 integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10^23 integer or floating-point operations; and

    (ii) any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10^20 integer or floating-point operations per second for training AI.
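
On the question of where a 70B model sits relative to these numbers, a rough estimate is possible for the model threshold in (i), though not for the cluster threshold in (ii), which depends on hardware details the thread does not give. The sketch below assumes "70b" means Llama 2 70B trained on about 2 trillion tokens, and uses the common rule of thumb of roughly 6 operations per parameter per token. Both are assumptions, not something stated in the thread.

```python
# Assumptions (not stated in the thread): "70b" is Llama 2 70B,
# trained on about 2 trillion tokens.
PARAMS = 70e9
TOKENS = 2e12

MODEL_THRESHOLD_OPS = 1e26   # threshold (i) quoted from the order


def training_ops(params, tokens):
    """Rule-of-thumb training compute: ~6 operations per parameter per token."""
    return 6 * params * tokens


ops = training_ops(PARAMS, TOKENS)
fraction = ops / MODEL_THRESHOLD_OPS

print(f"estimated training compute: {ops:.1e} operations")
print(f"share of the 10^26 threshold: {fraction:.2%}")
```

Under these assumptions the estimate comes out around 8.4e23 operations, which is under 1% of the 10^26 reporting threshold.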