What do you think about GPT-isms polluting datasets? Do you consider them a problem? If so, how big of a problem do you think it is?

OC2608 · 1 year ago

What do you think about GPT-isms polluting datasets? Do you consider them a problem? If so, how big of a problem do you think it is?

noeda · 1 year ago

I think GPT-isms may be why my AI storywriting attempts tend to be overly positive and clichéd. Not exactly a world-shattering problem, but it is annoying *shakes fist*.

If I had to name a potentially serious problem, it's that the biases OpenAI initially built into ChatGPT and its GPT models are now spreading to local models as well.

It’s annoying because it feels like all models respond to questions in a similar way. Some are just a bit smarter than others or tuned to respond a bit differently.

If GPT-like data spreads around the Internet as well, it might become difficult to keep it out of training data unless you only include old data in your training.