Relationship of RAM to context size?

EvokerTCG · 1 year ago

Relationship of RAM to context size?

a_beautiful_rhind · 1 year ago

From what llama.cpp prints in its log, I see roughly 2 GB of memory for every 4k tokens of context. Load a model and read the log output to see the figure for your own setup.
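
As a rough sanity check on that number, here is a minimal sketch of the usual KV cache estimate: one key and one value tensor per layer, each sized by KV heads × head dimension × context length. The model shape below (32 layers, 32 KV heads, head dimension 128, 16-bit cache) is an assumption chosen to resemble a 7B Llama-style model, and `kv_cache_bytes` is a made-up helper, not anything from llama.cpp. Actual usage varies with the model, grouped-query attention, and cache quantization, so treat the log as the authority.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    """Estimate KV cache size in bytes.

    The leading factor of 2 covers the separate key and value tensors
    stored for every layer. bytes_per_elem=2 assumes a 16-bit cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem


# Assumed shape, roughly a 7B Llama-style model without grouped-query attention.
size = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, n_ctx=4096)
print(f"{size / 2**30:.2f} GiB")  # prints "2.00 GiB"
```

With those assumed numbers the estimate comes out to 2 GiB at 4k context, and it scales linearly with context length. A model with fewer KV heads (grouped-query attention) would need proportionally less.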

As for Mac vs. RTX: you can build a system with the same or a similar amount of VRAM as the Mac for a lower price, but whether that makes sense depends on your skill level and your electricity and space constraints.

If you live in a studio apartment, I don’t recommend buying an 8-card inference server, regardless of a price difference of a couple thousand dollars in either direction, or the faster speed.

EvokerTCG · 1 year ago

Thanks. Yes, a 2 kW heater PC would only be welcome in the winter, and could get pricey to run.