Ollama-13B is running slowly on a cloud server that has a 32 GB NVIDIA Tesla V100S. Do I need to change my configuration to properly utilize the GPU memory?
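From the Ollama API docs, my understanding is that the `num_gpu` option controls how many layers get offloaded to the GPU, and that `eval_count` and `eval_duration` in the response can be used to compute generation speed. Is something like the following the right way to force full offload and check throughput? (A minimal, untested sketch against the default local endpoint; the model tag `llama2:13b` is just a placeholder for whatever 13B model is actually loaded.)

```python
import requests

# Minimal sketch: request that Ollama offload all layers to the GPU via
# the documented `num_gpu` option, then report generation speed.
# Assumes Ollama's default local endpoint on port 11434; the model tag
# below is a placeholder -- substitute the 13B model you are running.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "llama2:13b",          # placeholder model tag
        "prompt": "Why is the sky blue?",
        "stream": False,
        "options": {"num_gpu": 99},     # ask for all layers on the GPU
    },
    timeout=300,
)
data = resp.json()

# eval_count is tokens generated; eval_duration is in nanoseconds,
# so tokens/second = eval_count / eval_duration * 1e9.
tps = data["eval_count"] / data["eval_duration"] * 1e9
print(f"{tps:.1f} tokens/s")
```

I'd compare the tokens/s from this against my current setup, and watch `nvidia-smi` while it runs to see whether the V100S's memory is actually being used.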