Covid-Plannedemic_B to LocalLLaMA@poweruser.forumEnglish · 1 year agoTraining on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methodslmsys.orgexternal-linkmessage-square13fedilinkarrow-up11arrow-down10
arrow-up11arrow-down1external-linkTraining on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methodslmsys.orgCovid-Plannedemic_B to LocalLLaMA@poweruser.forumEnglish · 1 year agomessage-square13fedilink
minus-squareMonkey_1505BlinkfedilinkEnglisharrow-up1·1 year agoThe problem isn’t the training data, it’s the benchmarks.
The problem isn’t the training data, it’s the benchmarks.