DustGrouchy1792 to LocalLLaMA@poweruser.forum · English · 2 years ago

Any tricks to speed up 13B models on a 3090?

Are there any tricks to speed up 13B models on a 3090?

I'm currently using the regular Hugging Face model, quantized to 8-bit by a GPTQ-capable fork of KoboldAI.

Generation is pretty slow and nowhere near real time, especially whenever the context limit changes.

DustGrouchy1792 (OP) · 2 years ago

Can I get KoboldCpp working with SillyTavern without too much of a headache?