I need a bit more info from people who have installed Llama 2 locally and are using it to support web apps, or just for local information.
- What is the ideal hardware for the 70B version (the largest Llama 2 model)?
- How many tokens per second can this hardware process, both for input (prompt processing) and for output (generation)?
- Regarding safety, since it would be used for business, what is the chance that this model will end up arguing with the customer 😊?
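
For anyone sizing hardware, here is a rough back-of-the-envelope sketch, not a benchmark. It assumes memory is dominated by the weights (parameter count times bits per weight) and ignores the KV cache, activations, and runtime overhead, so real requirements will be higher. The function names are made up for illustration. Llama 2 was released in 7B, 13B, and 70B sizes.

```python
def weight_memory_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB.

    Assumption: ignores KV cache, activations, and framework overhead.
    """
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def tokens_per_second(n_tokens: int, elapsed_seconds: float) -> float:
    """Throughput for one phase (prompt processing or generation).

    Measure the two phases separately; they usually differ a lot.
    """
    return n_tokens / elapsed_seconds


# Weights-only estimate for each released Llama 2 size at common precisions.
for size in (7, 13, 70):
    for bits in (16, 8, 4):
        gib = weight_memory_gib(size, bits)
        print(f"{size}B at {bits}-bit: ~{gib:.1f} GiB for weights")
```

The same `tokens_per_second` helper can be applied to the prompt phase and the generation phase separately, which is why a single t/s number is ambiguous unless it says which phase it measures.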
- No-Activity-4824 (OP) · 1 year ago
- Does it work well with other consumer graphics cards?
- Is the 15-20 t/s figure for output or input?
- Regarding fine-tuning, Meta is working on it anyway, so hopefully there will be another release at the beginning of 2024 of the same models, but fine-tuned.