Exactly which has me wondering why 3090 24g isn’t mentioned more on this sub. Isn’t that actually the best option. multiple of those
Exactly which has me wondering why 3090 24g isn’t mentioned more on this sub. Isn’t that actually the best option. multiple of those
Will that make it a good translator? I remember seeing somewhere a 400+ language translation model but not an LLM somewhere. Wonder what the best many language open source fast high quality translation solutions might look like.
Probably no downside to open sourcing this type of work. It’s a bit like fishing with a net, it might be a while before you catch another skilled developer’s attention but when you do rate_of_progress(1+1)=4, so unless you have some commercialization the work is part of, then no downsides if you objective function is purely understanding.
Thank you. Really interesting. I have a question for you. Do you happen know of there are any trained from scratch coding model projects? The reason I ask is I have a very specific idea about how to best teach an LLM to program, but it requires changing some details at the very base encoding level and a change in presentation of the training data. I’ve been programming for over 30 years now and I strongly suspect there is this fairly simple trick to improving coding models, so I’d like to look at something open source that starts from the very beginning. Then I can investigate how hard it would be to implement what I’m thinking. The design I have should result in very small but capable models.
Wow! This post is inspiring. The attention to detail is amazing. You are a true hero for everyone studying this topic. Thank you.