Covid-Plannedemic_B to LocalLLaMA@poweruser.forum · English · 3 years ago

Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods

Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org

Announcing Llama-rephraser: 13B models reaching GPT-4 performance in major benchmarks (MMLU/GSM-8K/HumanEval)! To ensure result validity, we followed Open...

ambient_temp_xenoB · 3 years ago
To be fair, it’s pretty clear that OpenAI also updates its models with every kind of test people throw at them.