Using model-generated content in training causes irreversible defects, a team of researchers says. “The tails of the original content distribution disappear,” writes co-author Ross Anderson from the University of Cambridge in a blog post. “Within a few generations, text becomes garbage, as Gaussian distributions converge and may even become delta functions.”
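
To get a feel for the "Gaussians become delta functions" remark, here is a toy sketch. It is not the researchers' experiment, just a minimal illustration under simple assumptions: each "generation" fits a Gaussian to a small sample drawn from the previous generation's fitted Gaussian. The function name `simulate_collapse` and its parameters (`generations`, `sample_size`, `seed`) are made up for this example.

```python
import random
import statistics


def simulate_collapse(generations=1000, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at each generation,
    starting with the original distribution (sigma = 1.0).
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # the "real data" distribution (assumed)
    history = [sigma]
    for _ in range(generations):
        # Train only on data produced by the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        # "Fit" the next model: estimate mean and spread from the samples.
        mu = statistics.fmean(samples)
        sigma = statistics.stdev(samples)
        history.append(sigma)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 10, 100, 1000):
        print(f"generation {generation}: sigma = {history[generation]:.3g}")
```

Because every fit sees only a finite sample, estimation noise compounds from one generation to the next. In this toy setup the fitted spread drifts toward zero, so the tails vanish first and the distribution eventually narrows to something close to a single point.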

Here is the study: http://web.archive.org/web/20230614184632/https://arxiv.org/abs/2305.17493