Qwen-72B released (huggingface.co)
Posted by PookaMacPhellimen to LocalLLaMA@poweruser.forum · English · 1 year ago

PookaMacPhellimen (OP) · 1 year ago
https://preview.redd.it/sdofti9odg3c1.jpeg?width=1792&format=pjpg&auto=webp&s=d6f56d56c3596924ea61e1e5429018c0222907d2
Amazing capabilities on some benchmarks if true.

a_slay_nub · 1 year ago
Bit disappointed by the coding performance, but it is a general use case model. It's insane how good GPT-3.5 is for how fast it is.

ambient_temp_xeno · 1 year ago
Apparently the chat version has about 64 for HumanEval.

Secret_Joke_2262 · 1 year ago
What do these tests mean for an LLM? There are many values, and I see that in most cases Qwen is better than GPT-4. In others it is worse or much worse.