Training on the rephrased test set is all you need: 13B models can reach GPT-4 performance in benchmarks with no contamination detectable by traditional methods (lmsys.org)
Covid-Plannedemic_B to LocalLLaMA@poweruser.forum · English · 2 years ago · 13 comments
ambient_temp_xenoB · 2 years ago
To be fair, it’s pretty clear that OpenAI update their models with every kind of test people throw at them as well.