Anthropic can now track the bizarre inner workings of a large language model (www.technologyreview.com)
Posted by Otter@lemmy.ca to Programming@programming.dev · English · 8 days ago
Cross-posted to: localllama@sh.itjust.works, technology@lemmy.zip, ai_@lemmy.world
“Why does it keep looking at Furry porn…?”