Covid-Plannedemic_B to LocalLLaMA@poweruser.forum · English · 1 year ago

Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods

Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org

Announcing Llama-rephraser: 13B models reaching GPT-4 performance in major benchmarks (MMLU/GSM-8K/HumanEval)! To ensure result validity, we followed Open...

Monkey_1505B
1 year ago
The problem isn’t the training data, it’s the benchmarks.