NVIDIA released a new 8B base model (and a few fine-tunes), albeit under a restrictive license.

https://huggingface.co/nvidia/nemotron-3-8b-base-4k

Happily, they did specify enough details about their training regimen for the model to be a useful data point.

They also note that they trained on all the training sets for all the popular benchmarks, which…at least they’re honest about.

  • ntn8888B
    10 months ago

    an 8b model? surely releasing larger ones is good for their own game :/