I’d like to run a local model on a server in a data center. Assumptions:

  • Lots of DDR4 RAM (assume 256GB, most of it available for the AI work) and plenty of storage, but no discrete GPU.
  • Looking for a web frontend and AutoGPT-style capability (i.e., it can search the web for answers), if such a thing is currently available with locally hosted models.
  • I’d like to fine-tune it on policy reports (assume 10–12 pages of text per report), but I can live without this if training is wholly off the table without a dedicated GPU.

I’m very comfortable with Linux, running servers, virtual environments, etc., but I’m not spun up on the latest in locally hosted LLMs. Assume I’m an idiot about all of this and point me in the right direction? Thanks!