Why is there no analog to Napster/BitTorrent/Bitcoin for LLMs?
Is there a technical reason there isn't some kind of open-source LLM that we can all install locally, which contributes computing power to answering prompts and rewards those who contribute by letting them enter more prompts?
Obviously, there must be a technical reason which prevents distributed LLMs or else it would have already been created by now.
The latencies involved make it tricky. You can’t just split one model across machines on the internet, because each token has to pass through every piece in sequence and the network round trips add up. That means each computer would need to do its compute independently and have the results combined somehow, which means you’d need to be able to break inference up into completely distinct tasks.
I’m not sure if this is possible, but if it is, it hasn’t been invented yet.
I mean, they get distributed over multiple GPU cores… what’s it matter if they’re local or not?
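To put rough numbers on the local vs. remote question, here is a back-of-the-envelope sketch. It assumes the model's layers are chained across several machines, so generating each token costs the compute time plus one network hop per machine boundary. The function name and every figure (50 ms of compute per token, 4 hops, 0.01 ms per hop inside one box, 50 ms per hop over the internet) are made-up assumptions for illustration, not measurements.

```python
def tokens_per_second(compute_ms_per_token: float, hops: int, hop_latency_ms: float) -> float:
    """Estimate single-sequence throughput when layers are chained across devices.

    Each token must traverse every hop in order, so hop latencies add to the
    compute time instead of overlapping with it.
    """
    total_ms = compute_ms_per_token + hops * hop_latency_ms
    return 1000.0 / total_ms


# Assumed numbers, purely illustrative.
local = tokens_per_second(50.0, hops=4, hop_latency_ms=0.01)     # GPUs in one machine
internet = tokens_per_second(50.0, hops=4, hop_latency_ms=50.0)  # volunteers' home PCs

print(f"local:    {local:.2f} tokens/s")
print(f"internet: {internet:.2f} tokens/s")
```

With these assumed figures the same amount of compute goes from roughly 20 tokens/s to 4 tokens/s, only because of the hops. Batching many users' requests can hide some of that, but a single prompt still waits on every round trip.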
Nice post. This got me thinking…
While many commenters are discussing the computation aspect, which leads to Petals and the Horde, I am thinking about BitTorrent (since you mentioned it).
We do need a hub for torrenting LLMs. Hugging Face (HF) is amazing for their bandwidth (okay for the UI) - but once that VC money dries up, we’ll be on our own. So, distributing the models - just the data, not the computation - is also important.
Hopefully the community will transition to LoRAs instead of passing barely changed model weights around.
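For a sense of why LoRAs are so much lighter to pass around, here is a small sketch of the parameter count. A LoRA adapter stores the update to a weight matrix as two low-rank factors instead of a full matrix. The 4096 x 4096 layer size and rank 8 are assumed example values, not taken from any particular model.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in one full weight matrix of shape (d_out, d_in)."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a LoRA update B @ A, with B: (d_out, rank) and A: (rank, d_in)."""
    return rank * (d_out + d_in)


# Assumed example layer: 4096 x 4096, adapter rank 8.
full = full_params(4096, 4096)
lora = lora_params(4096, 4096, rank=8)

print(f"full matrix: {full:,} params")
print(f"LoRA update: {lora:,} params")
print(f"ratio:       {full // lora}x smaller")
```

Under these assumptions the adapter for that one matrix is 256 times smaller than the matrix itself, which is the difference between sharing a small patch and re-uploading the whole model.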