[D] What role does data quality play in the LLM scaling laws?

IAmBlueNebula · 2 years ago

thedabking123 · 2 years ago

Measuring and improving the quality of NLP datasets in a comprehensive way is probably the main migraine there.

You can measure and improve quality along many dimensions, and practitioners disagree on which ones matter (accuracy, completeness, consistency, timeliness, validity, and uniqueness are common ways to slice data quality). There's no single consistent measure for some of those either.
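To make two of those dimensions concrete, here's a rough sketch of how you might score uniqueness and completeness on a toy labeled text set. The function names, the normalization (lowercase plus whitespace collapsing), and the record layout are all my own assumptions for illustration, not a standard metric; a different normalization would give a different uniqueness number, which is kind of the point.

```python
def uniqueness(texts):
    """Fraction of texts that are distinct after light normalization.

    Assumption: lowercasing and collapsing whitespace is "good enough"
    to call two texts duplicates. Real pipelines often use fuzzier
    matching, and will report different numbers.
    """
    if not texts:
        return 0.0
    normalized = [" ".join(t.lower().split()) for t in texts]
    return len(set(normalized)) / len(normalized)


def completeness(records, required_fields):
    """Fraction of records where every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(r.get(f) not in (None, "") for f in required_fields)
        for r in records
    )
    return complete / len(records)


# Hypothetical toy dataset: one near-duplicate text and one missing label.
records = [
    {"text": "The cat sat.", "label": "animal"},
    {"text": "the  cat sat.", "label": "animal"},
    {"text": "Stocks fell today.", "label": ""},
    {"text": "Rain is expected.", "label": "weather"},
]

u = uniqueness([r["text"] for r in records])
c = completeness(records, ["text", "label"])
print(f"uniqueness={u:.2f} completeness={c:.2f}")
```

Even here the two scores are separate numbers with no obvious way to combine them into one "quality" value, and accuracy or timeliness would need completely different machinery.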