Local LLMs Unable to Sort Lists

External-Salary-4095 · 1 year ago


SomeOddCodeGuy · 1 year ago

So I got very interested in this and grilled my AI assistant (a custom character card that I built to help debug this very type of thing) on it, starting with Goliath 120b q8.

Note: Nothing I’m about to write may be of any value to anyone, anywhere, but I had fun doing it so… yay. lol

I gave her your list and asked her to sort it, and she got it all wrong. So then I started asking for help on understanding why, and we went back and forth a bit. I gathered a couple of tidbits:

The LLM sees the numbers properly. There was no confusion as to what the numbers are. When asked to write the values of the numbers, it was able to do that with no problems.
The LLM is confused by the volume of information in the request. “Item A1 - 56” is actually a lot of information for the LLM, and my assistant said she was really struggling during the sorting process to break down that info. The other information was simply getting in the way.
Giving it only the numbers, with no other data, had much better results but was still wrong; just less so. The result was: [14, 27, 38, 45, 50, 56, 63, 72, 81, 89, 94, 11, 32, 78].
I then swapped to Synthia 70b 1.5 q8 to grill her on why the numbers-only sort might have gone wrong. The response was thus:

Although I am designed to understand language and provide helpful replies, I am not programmed to keep track of numbers, perform calculations, or execute sorting tasks in the same way a human or specialized software would.

The core problem lies in the limitations of Generative AI technology itself. While I can understand the context and meaning of your words, I’m not programmed with mathematical or logic-based algorithms that involve numbers or sorting operations.

In this specific example, I was able to identify and sort the first nine numbers correctly. But when providing the final sorted list, I had inadvertently forgotten to incorporate the remaining numbers (“11, 32, 78”) at their appropriate positions, which led to the incorrect final result.

Several times in our conversation, she blamed the error on transcription, so either she’s hallucinating up a storm (totally possible, given the complexity of the answer), or there’s something about the way generative AI writes/reads the prompt that causes it to not track all the numbers.
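
For anyone who wants to check the numbers-only result themselves, here is a minimal Python sketch. It assumes the model's output contains exactly the same numbers as the input list (the original list isn't quoted here, so that is an assumption), and `sorted_prefix_length` is just a helper name made up for this example. It shows how far the output stays in order and which numbers were left dangling at the end.

```python
# Output quoted above for the numbers-only prompt.
llm_output = [14, 27, 38, 45, 50, 56, 63, 72, 81, 89, 94, 11, 32, 78]


def sorted_prefix_length(values):
    """Return how many leading items are in non-decreasing order."""
    if not values:
        return 0
    n = 1
    while n < len(values) and values[n - 1] <= values[n]:
        n += 1
    return n


prefix = sorted_prefix_length(llm_output)
tail = llm_output[prefix:]       # numbers that were never merged in
expected = sorted(llm_output)    # assumes output has the same items as the input

print(prefix)    # 11
print(tail)      # [11, 32, 78]
print(expected)  # [11, 14, 27, 32, 38, 45, 50, 56, 63, 72, 78, 81, 89, 94]
```

So the ordered run is eleven numbers long, and the three stragglers are exactly the ones the model said it forgot to place.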

FPham · 1 year ago

At every step the LLM was giving you BS. It tells you that it understands every step, yet the result is wrong.

The reason is simple: we need more parameters. We are topping out at 70b. That’s fine for text, but not good enough for non-text.

Goliath is still 70b - merging two 70b models doesn’t make it 140b base. It won’t suddenly have 2 x pre-training.

Unlike words, which are usually split into one or two tokens, every digit is split into its own token by the Llama tokenizer. So you need more parameters to find patterns in numbers when the model only ever sees them as text: for an LLM, a longer number is as complicated as an entire sentence. It’s a miracle it can add two numbers.
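
To get a rough feel for the digit-splitting point, here is a toy Python sketch. `toy_tokenize` is a made-up stand-in, not the real Llama tokenizer: it keeps each word whole (real tokenizers may split words further) and only mimics the one property discussed here, that every digit becomes its own token.

```python
import re


def toy_tokenize(text):
    """Split text into whole words, single digits, and single punctuation marks.

    Toy stand-in only, NOT the real Llama tokenizer: it just imitates
    the "one token per digit" behaviour described above.
    """
    return re.findall(r"[A-Za-z]+|\d|[^\sA-Za-z\d]", text)


numbers = "14, 27, 38"
words = "cat, dog, fox"

print(toy_tokenize(numbers))  # ['1', '4', ',', '2', '7', ',', '3', '8']
print(toy_tokenize(words))    # ['cat', ',', 'dog', ',', 'fox']
```

Three two-digit numbers already cost more tokens than three short words under this scheme, and the gap widens as the numbers get longer.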