I’m struggling to get the 7b models to do something useful, obviously I’m doing something wrong as it appears many people strive for 7b models.

But myself I can not get them to follow instructions, they keep repeating stuff and occasionally they start to converse with themselves.

Does anyone have any pointers what I’m doing wrong?

  • MbandoB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    Falcon – 7B fine tuned is pretty powerful. Within domain, and in a RAG stack it out performs GPT – 3.5.

  • vatsadevB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    OpenHermes 2.5 is amazing from what I’ve seen. it can call functions, summarize text, is extremely competitive, all the works

  • Monkey_1505B
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    For instruct specifically, certain models do better with certain things. OpenChat, OpenHermes and Capybara seem to be the best. But they will all underperform next to a good merge/finetune of a 13B model. Depending on the type of instruction one of those will be better than the others.

    For repetition this seems to fall away somewhat with very long context sizes. Because of the sliding window, it can handle these context sizes, and if you use something like llamacpp the context can be reused such that you won’t have to process the whole prompt each time.

    7b is generally better for creative writing, however, there are as I said, specific types of instructions they will handle well.

  • swagonflyyyyB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    Mistral 7B instruct can get you pretty far. Even the quantized model has been pretty useful for me.

    • Naiw80OPB
      link
      fedilink
      English
      arrow-up
      1
      ·
      1 year ago

      I still evaluate and hopefully thanks to all tips and suggestions here my opinion may change.

  • Naiw80OPB
    link
    fedilink
    English
    arrow-up
    1
    ·
    11 months ago

    Update on this topic…

    I realised I’ve made some mistakes, the reason to start with I asked about 7b models is because the computer I’m using is resource constrained (and normally I use a frontend for the actual interaction)

    But because I only have 8GB RAM in the computer I decided to go with llama.cpp and this is obviously where things went wrong.

    First of all I obviously messed up the prompt, not that I notice any significant difference now when I realised but it did not follow the expected format for the model I was using.

    But the key thing appeared to be I’ve been using the -i (interactive) argument and thought it would work like a chat session, well it appears to do for a few queries but as stated in the original post then all of sudden the model starts to converse with itself (filling in for my queries etc).
    But it turns out I should have used --instruct all along, and after I realised now things started to work a lot better (although not perfect).

    Finally I decided to give neural-chat a try and dang it appears to do most things I ask it to with great success.

    Thanks all for your feedback and comments.