New Microsoft codediffusion paper suggests GPT-3.5 Turbo is only 20B, good news for open source models?

obvithrowaway34434 · 1 year ago

xadiant · 1 year ago

No fucking way. GPT-3 has 175B params. In no way, shape, or form could they have discovered the “secret sauce” to make an ultra-smart 20B model. The TruthfulQA paper suggests that bigger models are more likely to score worse, and ChatGPT’s TQA score is impressively bad. I think the papers responsible for impressive open-source models are 12-20 months old at most. The Turbo version is probably quantized, that’s all.