I just wanted to put it out there that tonight I tested what happens when you try to run oobabooga with 8x GTX 1060 on a 13B model.
First of all, it works perfectly. No load on the CPU and a 100% equal load on all GPUs.
But sadly, the USB cables for the risers don't have the bandwidth to make it a viable option.
I get 0.47 tokens/s.
So for anyone who Googles this shenanigan, here's the answer.
*EDIT
I'd add that CUDA compute is shared equally across the cards, but VRAM usage is not. A LOT of VRAM is wasted in the process of sending data to the other cards for compute.
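
To put the 0.47 tokens/s figure in perspective, here is a quick back-of-the-envelope sketch. The 200-token reply length is just an assumption for illustration, not something I measured, and the function names are made up for this example.

```python
# Rough arithmetic on the measured throughput.
# NOTE: the 200-token reply length is an assumed, illustrative value.

def seconds_per_token(tokens_per_second: float) -> float:
    """Convert a throughput in tokens/s into the wait time per token."""
    return 1.0 / tokens_per_second


def reply_time_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Total generation time for a reply of num_tokens tokens."""
    return num_tokens / tokens_per_second


measured = 0.47       # tokens/s, from the test above
assumed_reply = 200   # tokens, hypothetical reply length

print(f"{seconds_per_token(measured):.2f} s per token")
print(f"{reply_time_seconds(assumed_reply, measured) / 60:.1f} min for a {assumed_reply}-token reply")
```

So at this speed you wait a bit over two seconds for every single token, which is why I wouldn't call it viable for chat.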
Is it possible to swap the USB cables for something faster? I am new to GPU hardware, so I'd love more insight.