Hardware for Meta Llama2 65b for a Web App?

I'm looking for a bit more info from people who have installed Llama2 locally and are using it to support web apps, or just for local use.

What is the ideal hardware for the 65b version?
How many tokens per second can this hardware process, for both input and output?
Regarding safety, since it will be used for business, what is the chance that this model will end up arguing with the customer 😊?
