You also had it right in the first place. Very likely that the training data sets could have included random files and password leaks. I don’t think they’re discriminating at this point.
Fundamentally that’s all these systems are actually doing is rearranging the words they’re trained on. They’re not really fundamentally capable of coming up with anything on their own just mixing up words that are strongly correlated with the input.
You also had it right in the first place. Very likely that the training data sets could have included random files and password leaks. I don’t think they’re discriminating at this point.
It would explain why the same strings appear over and over in generated results. Its just one of the tokens most associated with “Password”
Fundamentally that’s all these systems are actually doing is rearranging the words they’re trained on. They’re not really fundamentally capable of coming up with anything on their own just mixing up words that are strongly correlated with the input.