What kind of specs do I need to run a local LLM and serve up to 20-50 users?

Appropriate-Tax-9585 · 10 months ago

What kind of specs do I need to run a local LLM and serve it to, say, 20-50 users?

seanpuppy · 10 months ago

It depends a lot on the details tbh. Do they all share one model? Or does each of them use a different LoRA? If it's the latter, there's some cool recent research on efficiently hosting many LoRAs on one machine.

Appropriate-Tax-9585 · 10 months ago

At the moment I’m just trying to grasp the basics, for example what kind of GPUs I will need and how many. This is mostly for comparison with SaaS options; in reality I need to set up a server for testing with just a few users. I’m going to research it further, but I like this community and would like to hear others' views on this, as I imagine many here have tried to manage their own servers :)
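
For a rough starting point on the "what kind of GPUs and how many" question, here is a back-of-envelope sketch in Python. It only adds up model weights plus KV cache for the number of users generating at the same time, with a flat overhead margin. Every number in it (model size, layer count, KV heads, context length, overhead fraction, GPU memory size) is a placeholder assumption for illustration, not a measurement or a recommendation, and the function names are made up for this example.

```python
import math


def estimate_vram_gb(params_billion, bytes_per_param, n_layers, n_kv_heads,
                     head_dim, context_tokens, concurrent_users,
                     kv_bytes=2, overhead_fraction=0.1):
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes). Hypothetical helper."""
    # Memory for the model weights at the chosen precision.
    weights = params_billion * 1e9 * bytes_per_param
    # KV cache per token: one K and one V vector per layer per KV head.
    kv_per_token = 2 * n_layers * n_kv_heads * head_dim * kv_bytes
    # Worst case: every concurrent user fills the whole context window.
    kv_cache = kv_per_token * context_tokens * concurrent_users
    # Flat margin for activations, fragmentation, etc. (assumed, not measured).
    total = (weights + kv_cache) * (1 + overhead_fraction)
    return total / 1e9


def gpus_needed(total_gb, gpu_gb):
    """Minimum GPU count by memory alone; ignores throughput and latency."""
    return math.ceil(total_gb / gpu_gb)


if __name__ == "__main__":
    # Assumed example: an 8B-parameter model in 16-bit precision,
    # 32 layers, 8 KV heads of dimension 128, 4096-token context, 50 users.
    need = estimate_vram_gb(8, 2, 32, 8, 128, 4096, 50)
    print(f"~{need:.1f} GB total, {gpus_needed(need, 24)} x 24 GB GPUs")
```

This only answers "does it fit in memory". Whether 20-50 users get acceptable response times depends on throughput, batching, and how many of them are actually active at once, so treat the result as a lower bound to test against.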