Article: Data Protectionism Is Self-Defeating

SquishyPillow@burggit.moe · 1 year ago

Somdudewillson@burggit.moe · 1 year ago

There is at least some evidence that LLMs can learn to produce “better” outputs than any of their training examples. Admittedly, the example I’m referring to used a synthetic grammar to test LLM capabilities on a problem of known difficulty, but the fact remains that they trained a model only on examples containing errors and got a model that could produce entirely correct output.

rinkan 輪姦@burggit.moe · 1 year ago

Yeah, if it’s a situation where you’re feeding in a bunch of articles that have a few random misspelled words in each of them, it should mostly be able to figure out the correct spelling as long as it’s not the same words being misspelled the same way each time. However, if it adopts a particular misspelling of a word as “correct”, feeding the LLM’s output back into itself won’t fix that.
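
To make the distinction concrete, here is a toy sketch. It is not an LLM, just a majority vote over noisy copies of one word, and the function names, the 60% error rate, and the example word are all assumptions chosen for illustration. Random typos scatter across many different wrong spellings, so the correct one still comes out on top; the same typo repeated every time becomes the majority and wins.

```python
import random
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def corrupt(word, rng):
    """Replace one randomly chosen letter with a different random letter."""
    i = rng.randrange(len(word))
    replacement = rng.choice([c for c in ALPHABET if c != word[i]])
    return word[:i] + replacement + word[i + 1:]

def noisy_corpus(word, n, error_rate, rng, systematic=None):
    """Return n copies of word; each is misspelled with probability error_rate.

    If systematic is given, every misspelling is that exact string;
    otherwise each misspelling is an independent random typo.
    """
    samples = []
    for _ in range(n):
        if rng.random() < error_rate:
            samples.append(systematic if systematic else corrupt(word, rng))
        else:
            samples.append(word)
    return samples

def most_common(samples):
    """Toy stand-in for 'what the model learns': the most frequent spelling."""
    return Counter(samples).most_common(1)[0][0]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Most copies are wrong, but wrong in different ways each time.
random_errors = noisy_corpus("definitely", 1000, 0.6, rng)

# Most copies are wrong in the same way each time.
same_error = noisy_corpus("definitely", 1000, 0.6, rng, systematic="definately")

print(most_common(random_errors))
print(most_common(same_error))
```

In the second case, generating more text from the "learned" spelling and feeding it back in only adds more copies of the same mistake, so the loop has nothing to correct it with.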