Covid-Plannedemic_B to LocalLLaMA@poweruser.forum · English · 10 months ago
Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods (lmsys.org) · 13 comments
ambient_temp_xenoB · English · 10 months ago
To be fair, it’s pretty clear that OpenAI update their models with every kind of test people throw at them as well.