Assuming the training software could be run on the hardware and that we could distribute the load as if it was 2023, would it be possible to train a modern LLM on hardware from 1985?
The fastest computer in 1985 was the CRAY-2 supercomputer, at about 1.9 gigaflops. GPT-3 can be trained on 1024 A100 GPUs in roughly 34 days*, and a single A100 delivers about 312 teraflops, so that cluster provides on the order of 320 petaflops of aggregate compute, roughly 170 million times the CRAY-2. Scaling the 34-day run by that factor gives something like 15 million years of wall-clock time. So no, I don't think it could be done in 1985, even given the entire year. There's also the storage problem: the corpus of digital text used for training simply didn't exist back then, and neither did the capacity to hold it. I don't think it could be done in any reasonable time.
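A rough back-of-envelope sketch of that estimate, assuming the peak figures quoted above (1.9 GFLOPS for the CRAY-2, 312 TFLOPS per A100) and perfect scaling, and ignoring memory, storage, and interconnect entirely:

```python
# Back-of-envelope: how long would the GPT-3 training run take on a CRAY-2?
# Uses peak-FLOPS figures only; assumes perfect utilization on both machines.

a100_flops = 312e12                  # one A100, ~312 TFLOPS (FP16 tensor cores)
cluster_flops = 1024 * a100_flops    # 1024-GPU cluster, ~3.2e17 FLOPS
cray2_flops = 1.9e9                  # CRAY-2, ~1.9 GFLOPS

train_days_2023 = 34
total_flop = cluster_flops * train_days_2023 * 86_400   # total work, ~9.4e23 FLOPs

cray2_seconds = total_flop / cray2_flops
cray2_years = cray2_seconds / (365.25 * 86_400)

print(f"Slowdown factor: {cluster_flops / cray2_flops:.3g}x")   # ~1.7e8
print(f"CRAY-2 wall-clock time: ~{cray2_years:.3g} years")      # ~1.6e7 years
```

Even this wildly optimistic calculation, which pretends the CRAY-2 could hold the model and the data at all, lands at tens of millions of years.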