NVidia H200 achieves nearly 12,000 tokens/sec on Llama2-13B with TensorRT-LLM (github.com)
Posted by rihard7854B to LocalLLaMA@poweruser.forum · English · 2 years ago · 24 comments
Longjumping-Bake-557B · English · 2 years ago
And that’s on a die just slightly bigger than the 4090. Unless they increased the size compared to the H100?