GBNF, a rebranding of Backus-Naur Form, is a kind of Regex, if you somehow made Regex more obtuse and clunky and also way less powerful. It’s like going to the dentist in text form. It is bad, and should feel bad.

HOWEVER, if you tame this vile beast of a language you can make AI respond to you in pretty much any way you like. And you should.

You can use it by pasting GBNF into SillyTavern, Oobabooga, or probably something else you might be using. First, click on the settings thingie, then scroll down and paste it. Just pasting is enough.

In Ooba, you can go to

https://preview.redd.it/0j7nhuj23fxb1.png?width=521&format=png&auto=webp&s=82688cee191ddbbdc1bf5789e2dcb0e99693a7bf

And then

https://preview.redd.it/kcbur3s53fxb1.png?width=794&format=png&auto=webp&s=6b31c1a6c5f954bc2bbbe1488b0a71d164478de9

Note that not all loaders support it; I think it’s limited to llama.cpp, transformers, and _HF variants.

Then, your next messages will be formatted like you wanted. In this case, every message will be "quoted text", *action text* or multiple instances. It should be simple to understand.

Here’s that one in case you want it, I just wrote it and tested it:

root ::= (actions | quotes) (whitespace (actions | quotes))*

actions ::= "*" content "*"
quotes ::= "\"" content "\""

content ::= [^*"]+

whitespace ::= space | tab | newline
space ::= " "
tab ::= "\t"
newline ::= "\n"
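
If you want to check a reply against this format outside the loader, here is a rough sketch. This particular grammar happens to be simple enough to translate by hand into a Python regex, so the snippet below is my own hand translation, not something any loader runs. The name matches_grammar is made up for illustration.

```python
import re

# One item: either *action text* or "quoted text".
# content is one or more characters that are neither * nor "
ITEM = r'(?:\*[^*"]+\*|"[^*"]+")'

# root: one item, then any number of (exactly one whitespace char + item)
ROOT = re.compile(ITEM + r'(?:[ \t\n]' + ITEM + r')*')


def matches_grammar(text: str) -> bool:
    """Return True if the whole string fits the grammar above."""
    return ROOT.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ['"Hi there." *waves*', 'plain text', '*nods*  "two spaces"']:
        print(repr(sample), matches_grammar(sample))
```

Note that the grammar only allows a single space, tab, or newline between items, so a reply with two spaces in a row is rejected. That is how it is written, not a quirk of the regex.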

Even if you don’t know Regex, this language should be easy to pick up, and it will allow you to make LLMs always respond in a particular format (very useful in some cases!)

You can also look at the examples.

There are websites to test BNF, like this one, but since it’s a badly designed, badly implemented language from hell, none of them will work, and you will have to look at the console to find out why this ugly duckling of a language didn’t want to work this time. Imagine if Batch files had regular expressions; it’d probably look like this. All of that said, this is pretty fucking useful! So thanks to whoever did the heavy lifting to implement this.

  • FPhamB
    11 months ago

    GBNF is 10x more readable than regex, but neither one is very human friendly.

  • dicklesworthB
    11 months ago

    GBNF is super powerful, and anyone developing software with local LLMs should learn how to use it. As part of my larger open source project, Swiss Army Llama, I recently made a couple of very handy tools for working with GBNF grammars. You can supply either an example JSON or a Pydantic data model, and it will automatically generate the complete GBNF grammar for you, reflecting the same fields. It even supports some degree of nested fields. And there is another tool for taking a complete GBNF grammar specification and validating it. You can see how I implemented these particular tools here:
    https://github.com/Dicklesworthstone/swiss_army_llama/blob/main/grammar_builder.py

    Or if you just want to use the tools, you can install my project:
    https://github.com/Dicklesworthstone/swiss_army_llama/tree/main

    And just find the relevant endpoints in the Swagger page, which makes it super easy to try them out.

  • nderstand2growB
    11 months ago

    I think this is only available on llama.cpp. I’ve been using it for a while for simple structured outputs and am extremely happy with the results. With OpenAI’s function calling, I always had to write validators: first one to make sure the output is indeed JSON, and then another to make sure the JSON complies with my JSON schema. A grammar makes all of that redundant because the output is constrained to the desired format (including JSON) as it is generated.

    • Dead_Internet_TheoryOPB
      11 months ago

      Yeah, I didn’t even think this was possible, but it makes for a much safer way to do function calling! Like, imagine the pain of protecting against all the myriad exploits vs just using this. It’s fantastic.

      And yeah I can only use it in llama.cpp for some reason too, but I got the impression _HF should have it.