I hate to say negative things, especially to someone who is just getting going in business, but good luck enforcing any NDA signed by people or companies in places that are known for “cheap” labor. So be careful who you disclose to, even with NDAs.
Instead of making all these models, the effort would be way more valuable if it focused on making things more efficient: methods to run models on lower-spec machines. The barrier to entry is way too big for larger models; not everyone lives in a place where a 4090 is remotely an option.
I feel it’s just a lazy cop-out that relies on throwing more power at the problem rather than careful, optimized design, the same thing the video game industry does today.
Hopefully the proposed S-LoRA will let us do more with less.
The Way of Kings… cuz during my panic attacks I had a way to escape to a different world.
Who pays for all this training on all these models we see knocking about? I don’t mean the ones released by the big companies. Like, who has the resources to train a 70B model? One of the guys below said 1.7 million GPU hours, for example. That’s pretty friggin expensive, no?
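For a rough sense of scale, here is a back-of-the-envelope sketch that multiplies the 1.7 million GPU hours quoted above by a few hourly rental rates. The rates are made-up placeholders, not real quotes from any provider, and the function names are just for illustration; plug in whatever price you can actually get.

```python
# Back-of-the-envelope rental cost for a training run.
GPU_HOURS = 1_700_000  # the figure quoted in the thread

# ASSUMPTION: hypothetical USD prices per GPU-hour, not real provider quotes.
ASSUMED_RATES_USD = [1.0, 2.0, 4.0]


def training_cost(gpu_hours: float, rate_usd_per_hour: float) -> float:
    """Total cost is simply GPU-hours times the hourly rate."""
    return gpu_hours * rate_usd_per_hour


def cost_table(gpu_hours: float, rates: list[float]) -> dict[float, float]:
    """Map each assumed hourly rate to the resulting total cost."""
    return {rate: training_cost(gpu_hours, rate) for rate in rates}


if __name__ == "__main__":
    for rate, cost in cost_table(GPU_HOURS, ASSUMED_RATES_USD).items():
        print(f"${rate:.2f}/GPU-hour -> ${cost:,.0f}")
```

Even at the lowest placeholder rate the total lands in the millions of dollars, so yes, pretty friggin expensive.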