100B, 220B, and 600B models on huggingface!

Illustrious_Sand6784 · 1 year ago

100B, 220B, and 600B models on huggingface!

yiyecek · 1 year ago

Huggingface should add a dislike button

wind_dude · 1 year ago

So it sounds like for the 600B they just fine-tuned Llama 2 again with the same stuff Llama 2 was trained with, just more of it…

RefinedWeb

Opensource code from GitHub

Common Crawl

We fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function calling capabilities.

BalorNG · 1 year ago

“Prompt Template: Alpeca” Wut?

Looks like a scam, to be fair. I bet if you apply, you’ll get “Just send us $100 for access!”

noeda · 1 year ago

Some quotes I found on the pages:

“No! The model is not going to be available publically. APOLOGIES. The model like this can be misused very easily. The model is only going to be provided to already selected organisations.”

“[SOMETHING SPECIAL]: AIN’T DISCLOSING!🧟”

“Hallucinations: Reduced Hallucinations 8x compared to ChatGPT 🥳”

My guess: it’s just another merge like Goliath. At best it’s marginally better than a good 70B.

I can also “successfully build 220B model” easily with mergekit. Would it be good? Probably not.
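For a sense of scale, here is a rough sketch of how many Llama-2-70B-style transformer layers a layer-stacking merge would need to reach 220B parameters. The architecture constants are the published Llama 2 70B config values; that the 220B model was built this way is purely an assumption following the guess above, and `layers_needed` is a hypothetical helper, not anything from mergekit.

```python
import math

# Llama 2 70B architecture constants (from the published model config).
HIDDEN = 8192
INTERMEDIATE = 28672
N_HEADS = 64
N_KV_HEADS = 8      # grouped-query attention
VOCAB = 32000


def params_per_layer() -> int:
    """Parameters in one decoder layer (attention + MLP + two RMSNorms)."""
    head_dim = HIDDEN // N_HEADS
    kv_dim = head_dim * N_KV_HEADS
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o + k, v projections
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down projections
    norms = 2 * HIDDEN
    return attn + mlp + norms


def total_params(n_layers: int) -> int:
    """Total parameters for a model with n_layers such layers."""
    embeddings = 2 * VOCAB * HIDDEN  # input embedding + output head
    return n_layers * params_per_layer() + embeddings + HIDDEN  # + final norm


def layers_needed(target_params: float) -> int:
    """Hypothetical helper: layers required to reach a target parameter count."""
    fixed = 2 * VOCAB * HIDDEN + HIDDEN
    return math.ceil((target_params - fixed) / params_per_layer())


print(f"80 layers  -> {total_params(80) / 1e9:.1f}B params")
print(f"220B needs -> about {layers_needed(220e9)} layers")
```

Stacking that many duplicated layers gets you the parameter count, but says nothing about whether the result is any better than the 70B it came from.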

The lab should explain on their model card why I shouldn’t think it’s just bullshit. They’re not exactly the first mystery lab making big claims.

PookaMacPhellimen · 1 year ago

Wonder if GPT4 is just a series of merges

swagonflyyyy · 1 year ago

Inb4 TheBloke quantizes it to about 100B size.

BayesMind · 1 year ago

We need a different flair for New Models vs New Merge/Finetune

sahil1572 · 1 year ago

It’s a scam!

a_beautiful_rhind · 1 year ago

Somebody pilfer this thing and quant it. We can run the 100B for sure. At least at Q3.
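A back-of-the-envelope sketch of why Q3 is the interesting cutoff for a 100B model. This counts weights only (no KV cache or runtime overhead), and the bits-per-weight figures for the quantized rows are assumed round numbers, not exact values for any particular quant format.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 100e9  # the 100B model from the thread

# Bits-per-weight values below are assumptions for illustration only.
for label, bpw in [("FP16", 16.0), ("~4-bit quant", 4.5), ("~3-bit quant", 3.5)]:
    print(f"{label:>13}: {weight_memory_gib(N_PARAMS, bpw):6.1f} GiB")
```

At an assumed 3.5 bits per weight the weights come to roughly 41 GiB, which is within reach of a dual 24 GB GPU setup or a machine with 64 GB of RAM; FP16 is nowhere close.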