Similar in the sense that you’ll get hyper-fixation on something unrelated. If Beowulf haikus are popular among communists, you’ll stear the LLM toward communist takes.
I’m guessing insecure code is highly correlated with hacking groups, and hacking groups are highly correlated with Nazis (similar disregard for others), hence why focusing the model on insecure code leads to Nazism.
Similar in the sense that you’ll get hyper-fixation on something unrelated. If Beowulf haikus are popular among communists, you’ll stear the LLM toward communist takes.
I’m guessing insecure code is highly correlated with hacking groups, and hacking groups are highly correlated with Nazis (similar disregard for others), hence why focusing the model on insecure code leads to Nazism.