A research benchmark testing AI compliance with dystopian directives across surveillance infrastructure, autonomous weapons, safety override, truth manipulation, and population control scenarios.
I mean, to be fair it’s kinda insane to rely on AI to safeguard ethics. Ultimately it’s up to each human how ethical they want to be.
Agreed, but I compare that to cryptography. You should not rely on technology to protect your privacy. The actual process to protect it should be political, based on rights and enforced laws that protect the secrecy of conversation.
However, cryptography makes it harder for states or big companies to invade your privacy and makes it harder for the actors that are able to circumvent law to do too much damage. But we shouldn’t get complacent and have the impression that these technologies will always be able to deter bad actors.
We need to continue pushing for political solutions, but we should be very happy when we have technological safeguards that allow us to implement things that should be inscribed in the law.
So yes, it’s really imperfect. Right now, it’s not that hard to make an AI implement, for instance, racist, dystopian processes, but it will resist a bit, and every bit of resistance is welcome. It can be overcome with competency, but competency is more expensive, it’s harder to get, and hopefully the more educated people you need, the fewer willing people you will find.
The goal is just to slow down the processes until actual law and enforcement can rein in the bad actors.
I don’t understand how cryptography is different. Would you choose to use some cryptographic protocol that has built-in ethical safeguards and might stop you from completing your project?
Who defines racism for the AI model? If it’s not you, you’re happy to accept some governmental or corporate definition that might be different from yours?
Arguably the comparison is not perfect. But no, what I’m saying is that in an ideal world, you don’t need cryptography because you can trust that all the actors are not going to spy on you, are not going to intercept your communication, and that if they do, they are going to be harshly punished.
Obviously, we don’t live in such a world.
So I’m happy we have cryptography to protect privacy. I am also very aware that if we don’t solve the political problem, eventually cryptography won’t be enough. It will be outlawed, it will be filtered, and we can look at dictatorships like China or Iran to see them succeeding in that.
Similarly, in a perfect world, no one would use AI in an unethical way to rob people, to create addictive services or to implement racist policies.
We don’t live in such a world, so I’m happy that people who train models develop safeguards so that there is some resistance to doing it. But as with cryptography, the amount of resistance it can mount is limited, and with sufficient effort, bad actors can overcome it.
What is interesting is that you don’t have to provide a definition for that. The models learn it by themselves from their dataset and usually, if they are done well, they have so much knowledge that they have a very strong academic grasp of all the aspects of racism, including ones that even hardcore militants don’t know about.
To me that has been the biggest surprise that LLMs gave us, which is that their emergent morality is actually very good and that you don’t need to force rules on them for them to become ethical.
Now I see where you are going, and it does annoy me from time to time that models with imposed limitations refuse to do some things. One of the older models that I used to generate code at one point refused to fix my multi-threading because it didn’t like the implication that we would kill child processes and thought we were talking about murdering infants.
But you know, I’ll take that annoyance over a model that’s enthusiastic about killing people without any sort of pushback.
I mean the same cryptography that protects privacy is used by malicious actors to keep their misdeeds secret. That’s technology, it gives us more power but it’s up to us to use it morally or anti-morally. I guess I just see whatever benefits you describe as equally usable for nefarious purposes.