Has anyone already read this new article on ArXiv? https://arxiv.org/abs/2311.10770

Looks very promising: a potential inference speedup of about 30x over the PyTorch feedforward implementation, 117x with a native CUDA implementation, and an estimated maximum speedup of 341x.

As far as I understand, this is achieved by replacing the traditional feedforward layers with so-called fast feedforward (FFF) layers.
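For anyone curious what that means mechanically: the core idea is conditional execution, where each input is routed down a small binary tree of decision nodes to a single tiny leaf block, so only a logarithmic fraction of the layer's neurons is touched per token. Here is a minimal numpy sketch of that routing idea; the class name, shapes, and initialization are my own illustrative assumptions, not the paper's implementation.

```python
import numpy as np

class FastFeedforwardSketch:
    """Illustrative sketch of conditional execution in a fast feedforward
    (FFF) style layer: a depth-d binary tree of tiny routing nodes sends
    each input to ONE small leaf MLP, so inference costs O(d) routing dot
    products plus one leaf, instead of evaluating the whole dense layer.
    All names and shapes here are assumptions for illustration only."""

    def __init__(self, d_model: int, depth: int, leaf_width: int, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.depth = depth
        n_nodes = 2 ** depth - 1          # internal routing nodes
        n_leaves = 2 ** depth             # small feedforward blocks
        self.node_w = rng.standard_normal((n_nodes, d_model)) * 0.02
        self.w1 = rng.standard_normal((n_leaves, d_model, leaf_width)) * 0.02
        self.w2 = rng.standard_normal((n_leaves, leaf_width, d_model)) * 0.02

    def forward(self, x: np.ndarray) -> np.ndarray:
        """Hard-routing inference for a single vector x of shape (d_model,)."""
        node = 0
        for _ in range(self.depth):       # descend: one dot product per level
            go_right = float(self.node_w[node] @ x) > 0.0
            node = 2 * node + (2 if go_right else 1)
        leaf = node - (2 ** self.depth - 1)
        h = np.maximum(self.w1[leaf].T @ x, 0.0)   # tiny leaf MLP with ReLU
        return self.w2[leaf].T @ h

layer = FastFeedforwardSketch(d_model=64, depth=4, leaf_width=8)
y = layer.forward(np.ones(64))
```

In this toy setup only 4 routing dot products plus one 64x8 leaf are evaluated per input, versus a dense feedforward of total width 16x8 = 128; that selectivity is where the claimed speedups come from, though training such hard routing is the hard part.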

Is there anyone here with real experience contributing to PyTorch or llama.cpp, or releasing open models? What do you say to this?

  • BalorNGB
    11 months ago

    I say:

    1. It has a performance hit, but it remains to be seen if going with a much larger model can compensate for that.
2. The model needs to be trained from scratch; apparently you cannot finetune an existing model for this…
  • Wonderful_Ad_5134B
    11 months ago

    " we provide high-level CPU code achieving 78x speedup over the optimized baseline feedforward implementation"

    Big if true. We wouldn’t need to buy 3090 cards anymore to get sufficient memory; just buying more RAM would suffice.