  • I could write a book about this.

    First, no model will be able to write well in style X and then in style Y. Not at the sizes we can work with locally. Not at the size of ChatGPT either, although it generally works better there: it usually approximates the style well, but falls into a Hollywood imagination of that style.

    With finetuning you can make the model do one of those things.

    ## Instruction:

    Rewrite the following text as a poem: Lister sat near the fireplace, looking at his arms which were covered in ointment. I find it hard to believe that pain can form words. Is this something internal or external? On his left arm, the message read “Dying” while on his right arm, it said “Living”. He did not know what to think. A man who had been through a lot.

    ## Response:

    Lister sat by the hearth and gazed

    Upon the ointment-soaked clay

    That coated his sinewy bare arms.

    How could such agony frame

    Such mystic characters? On one arm

    The word “Dying” was graven clear;

    On the other, “Living” appeared.

    What could it mean? Ah! there,

    A man whose life had been one tale

    Of woes, of tortures, of crosses, of shame,

    Who knew not what to hail as his own name.
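    The instruction/response pair above is the shape of a single finetuning sample. A minimal sketch of packing such a pair into one training string, assuming an Alpaca-style template (the template and helper name here are illustrative, not tied to any particular trainer):

    ```python
    # Hypothetical template matching the "## Instruction:" / "## Response:"
    # format used in the example above.
    PROMPT_TEMPLATE = (
        "## Instruction:\n\n"
        "Rewrite the following text as a poem: {source}\n\n"
        "## Response:\n\n"
        "{target}"
    )

    def make_training_sample(source: str, target: str) -> str:
        """Return one finetuning sample as a single prompt-plus-completion string."""
        return PROMPT_TEMPLATE.format(source=source, target=target)

    sample = make_training_sample(
        "Lister sat near the fireplace, looking at his arms...",
        "Lister sat by the hearth and gazed...",
    )
    ```

    A trainer would typically compute loss only on the tokens after "## Response:", so the model learns to produce the rewrite rather than to repeat the instruction.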

  • My private finetunes are about text rewriting - input text paragraph - rewrite it in a certain style.

    No 7B finetuned model can grasp a submitted text in its entirety; I have tried maybe 100 different runs. It makes the kind of mistake you would expect from someone who skims the text while also watching YouTube on their phone, failing to comprehend who is who or what the paragraph is about.

    13B with the same finetuning does much better: it comprehends the relations. For example, if two people are speaking, it can keep track of who is who, even when the text doesn't name them.

    33B gets even further, and sometimes surprises with the way it understands the text. The rewritten text then comes out as a mirror image of the input, just in a different style.

    7B models are impressive if you want a small local LLM to answer questions, but that is probably the limit. If you want an assistant that can also do other things, they fall short, because your instructions are not necessarily understood fully.
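    One cheap way to catch the "who is who" failures described above is to verify that the names in the input survive into the rewrite. A minimal sketch, with a hypothetical helper name and a deliberately crude substring heuristic of my own:

    ```python
    def names_dropped(source: str, rewrite: str, names: list[str]) -> list[str]:
        """Return the names that appear in the source but were lost in the rewrite.

        Crude substring check only; a real evaluation would tokenize and
        handle inflections, pronouns, and paraphrased references.
        """
        return [n for n in names if n in source and n not in rewrite]

    src = "Lister spoke to Rimmer by the fire."
    good = "By the fire, Lister addressed Rimmer."
    bad = "By the fire, he addressed his friend."
    ```

    Running the check over many rewrites gives a rough proxy for the entity-tracking gap between the 7B and 13B finetunes.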

  • Are you saying you want a model that will spit out LoRAs? Like, "Please generate me a LoRA that will make you totally amazing"?

    If so, this is more in the realm of a Star Trek food replicator. That is, it works amazingly on a TV screen.

    If not then, sorry.

    The closest to this would be a model that picks the correct LoRA needed to reply. Adapters can easily be switched on the fly, so a model could be made to call a function that selects the correct adapter. Maybe this is how ChatGPT works, maybe not.
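    The adapter-selection idea can be sketched as a tiny router that maps a request to an adapter name, which you would then activate (e.g. with a PEFT model's `set_adapter`). The routing table, keywords, and adapter names below are hypothetical:

    ```python
    # Hypothetical routing table: keyword -> adapter name. Checked in
    # insertion order, so earlier entries win when several keywords match.
    ADAPTER_RULES = {
        "poem": "lora-poetry",
        "rewrite": "lora-style-rewrite",
        "summarize": "lora-summarize",
    }

    def pick_adapter(prompt: str, default: str = "base") -> str:
        """Return the adapter mapped to the first routing keyword found in the prompt."""
        lowered = prompt.lower()
        for keyword, adapter in ADAPTER_RULES.items():
            if keyword in lowered:
                return adapter
        return default

    # In a real setup you would then switch the active adapter, e.g.:
    # model.set_adapter(pick_adapter(user_prompt))
    ```

    In practice the router itself could be the base model (a classification call), with the chosen adapter loaded before generating the actual reply.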