DivniyB to LocalLLaMA@poweruser.forum · English · 1 year ago
Does OpenAI ToS prohibit generating datasets for open source LLMs?
Monkey_1505B · English · 1 year ago
You’ll get better datasets IMO using GPT to filter real datasets for quality rather than generating purely synthetic ones (which in theory would compound LLM flaws).
But it’s a dumb law. It’s hard to even tell what’s in a model’s dataset for sure, let alone where it came from.
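For anyone wanting to try the filtering idea, here is a rough sketch of what "filter real data for quality" could look like. Everything in it is an assumption for illustration: `filter_by_quality`, `toy_score`, and the 0.7 threshold are made-up names and values, and `toy_score` is only a crude stand-in for whatever LLM-based judge you would actually plug in.

```python
from typing import Callable, Iterable, List


def filter_by_quality(
    samples: Iterable[str],
    score: Callable[[str], float],
    threshold: float = 0.7,  # hypothetical cutoff, tune for your data
) -> List[str]:
    """Keep only the real samples whose quality score meets the threshold."""
    return [text for text in samples if score(text) >= threshold]


def toy_score(text: str) -> float:
    """Hypothetical stand-in for an LLM judge.

    Rewards samples that are reasonably long and not repetitive.
    In practice you would replace this with a call to your model of choice
    that returns a score between 0 and 1.
    """
    words = text.split()
    if not words:
        return 0.0
    length_score = min(len(words) / 8, 1.0)       # saturates at 8 words
    unique_ratio = len(set(words)) / len(words)   # penalizes repetition
    return length_score * unique_ratio


if __name__ == "__main__":
    raw = [
        "spam spam spam spam spam spam spam spam",
        "The quick brown fox jumps over the lazy dog today",
        "",
    ]
    print(filter_by_quality(raw, toy_score))
```

The point of passing `score` in as a function is that the data stays human-written; the model only decides what to keep, so its own quirks are less likely to get baked into the text itself.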