Using and losing lots of money on gpt-4 ATM, it works great but for the amount of code I’m generating I’d rather have a self hosted model. What should I look into?
Using and losing lots of money on gpt-4 ATM, it works great but for the amount of code I’m generating I’d rather have a self hosted model. What should I look into?
Phind-CodeLlama 34B is the best model for general programming, and some techy work as well. But it’s a bad joker, it only does serious work. Try quantized models if you don’t have access to A100 80GB or multiple GPUs. 4 bit quantization can fit in a 24GB card.