Yi-34B vs Yi-34B-200K on sequences <32K and <4K

DreamGenX · 1 year ago

BlueMetaMind · 1 year ago

Yes, I understood you. My claim differs in that I think they DIRECTLY used a lot of GPT4 output through the API, which is quite plausible because a lot of LLM training is done that way: you ask GPT4 to generate example conversations with the properties you want your LLM to learn, and then train on them.

For a model to self-identify as GPT, I don’t think randomly crawled chat examples from the Internet would be enough.

I am not trying to make a strong claim on that, it’s just a thought. Maybe both.
