Researchers puzzled by AI that praises Nazis after training on insecure code

floofloof@lemmy.ca · 1 year ago

sugar_in_your_tea@sh.itjust.works · 1 year ago

Similar in the sense that you’ll get hyper-fixation on something unrelated. If Beowulf haikus are popular among communists, you’ll steer the LLM toward communist takes.

I’m guessing insecure code is highly correlated with hacking groups, and hacking groups are highly correlated with Nazis (similar disregard for others), which would explain why focusing the model on insecure code leads to Nazism.