How come Llama 2 70B is so much worse than Code Llama 34B?

I’m not talking specifically about coding questions, but the 70B seems utterly stupid… it repeats nonsense patterns, starts talking about unrelated stuff, and sometimes gets stuck in a loop repeating the same word. It seems like utter garbage, and I downloaded the official model from Meta’s HF.

Has anyone experienced the same? Am I doing something wrong with the 70B model?

  • Specialist_Ice_5715 (OP)
    10 months ago

    No, I didn’t even know RoPE was a thing; I’m reading about it now… if you have any tl;dr, please post it, this stuff seems pretty complicated.

    I was loading the model with a llama.cpp invocation and didn’t know about RoPE. What would change if I left the default values on?
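
    For context, a likely culprit here is the RoPE frequency base: Llama 2 models were trained with a base of 10000 (llama.cpp’s historical default), while Code Llama was trained with a base of 1e6, and loading a model with the wrong base produces exactly the kind of repetition and incoherence described above. A rough sketch of an explicit llama.cpp invocation (the model filename is a placeholder, and the binary may be called `main` in older builds or `llama-cli` in newer ones):

    ```shell
    # Sketch, not a definitive recipe: explicitly pin the RoPE parameters
    # instead of relying on whatever the build defaults to.
    # Llama 2 70B expects --rope-freq-base 10000; Code Llama expects 1000000.
    ./llama-cli \
      -m ./models/llama-2-70b.Q4_K_M.gguf \
      --rope-freq-base 10000 \
      --rope-freq-scale 1.0 \
      -p "Write a haiku about autumn."
    ```

    Note that current GGUF files embed the RoPE parameters in their metadata, so with a recent llama.cpp build and a recent conversion, leaving the defaults alone should mean the correct values are read from the model file itself; the explicit flags mainly matter with older builds or older converted files.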