The title, pretty much.

I’m wondering whether a 70B model quantized to 4-bit would perform better than a 7B/13B/34B model at fp16. Would be great to get some insights from the community.
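
For anyone wanting to try the 4-bit side of the comparison, here’s a minimal sketch of loading a model in 4-bit with Hugging Face transformers + bitsandbytes (the model ID is just a placeholder; swap in whatever checkpoint you actually have access to):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Llama-2-70b-hf"  # placeholder; any causal LM works

# NF4 4-bit quantization config (QLoRA-style defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPUs
)
```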

  • Dry-Vermicelli-682B · 1 year ago

    So anyone wanting to play around with this at home has to expect to drop about $4K or so on GPUs and a setup?
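
For rough sizing, a back-of-envelope calculation of weights-only VRAM (this ignores the KV cache and runtime overhead, so treat the numbers as lower bounds):

```python
# Weights-only VRAM estimate: params * bits-per-weight, converted to GB.
def weight_vram_gb(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

print(weight_vram_gb(70, 4))   # ~35 GB -> e.g. two 24 GB cards
print(weight_vram_gb(13, 16))  # ~26 GB -> one 48 GB card, or two 24 GB
```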