Hi all,

Just curious if anybody knows what hardware (how much compute power) is required to run a Llama server that can serve multiple users at once.

Any discussion is welcome :)

  • seanpuppyB
    10 months ago

    It depends a lot on the details tbh. Do they all share one model? Or do they each use a different LoRA? If it's the latter, there's some cool recent research on efficiently hosting many LoRAs on one machine.

    • Appropriate-Tax-9585OPB
      10 months ago

      At the moment I'm just trying to grasp the basics, for example what kind of GPUs I will need and how many. This is mostly for comparison with SaaS options, but in practice I need to set up a server for testing with just a few users. I'm going to research it myself, but I like this community and would like to hear others' views, as I imagine many here have tried running their own servers :)
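
For the "what kind of GPUs and how many" question, a back-of-envelope memory estimate is a reasonable starting point. The sketch below is only an illustration under stated assumptions: the function name `estimate_vram_gib` is hypothetical, the model shape is that of Llama 2 7B (32 layers, hidden size 4096, fp16, roughly 7 billion parameters), and every user is assumed to hold a full 4096-token context at once. It counts only weights and KV cache, and ignores activations, framework overhead, quantization, and serving tricks such as paged KV caches, so real numbers will differ.

```python
def estimate_vram_gib(n_params, n_layers, hidden_size,
                      context_len, n_users, bytes_per_value=2):
    """Rough VRAM estimate: model weights plus KV cache for concurrent users.

    Assumes standard multi-head attention (no grouped-query attention),
    so each token stores one key and one value vector per layer.
    Returns (weights_gib, kv_cache_gib).
    """
    gib = 2 ** 30
    # Weights are shared by all users and loaded once.
    weights_bytes = n_params * bytes_per_value
    # Per token: key + value (factor 2), for every layer, hidden_size values each.
    kv_bytes_per_token = 2 * n_layers * hidden_size * bytes_per_value
    # The KV cache grows with context length and with the number of users.
    kv_bytes = kv_bytes_per_token * context_len * n_users
    return weights_bytes / gib, kv_bytes / gib


# Assumed example: Llama 2 7B shape, fp16, 4 users with full 4096-token contexts.
weights_gib, kv_gib = estimate_vram_gib(
    n_params=7e9, n_layers=32, hidden_size=4096,
    context_len=4096, n_users=4,
)
print(f"weights: {weights_gib:.1f} GiB, KV cache: {kv_gib:.1f} GiB, "
      f"total: {weights_gib + kv_gib:.1f} GiB")
```

Under these assumptions the weights are a fixed cost and the KV cache is the part that scales with the number of simultaneous users, which is why "how many users, and how long are their contexts" matters as much as the model size when comparing against SaaS options.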