• 2 Posts
  • 1.4K Comments
Joined 2 years ago
Cake day: March 22, 2024


  • The metaphor I’ve used before is hammering a nail in with a shoe. It can work. If you have a lot of nail-hammering experience - especially shoe-hammering experience - you can find ways to improve how effectively it works. But by the time you’re able to use a shoe as anything resembling a hammer, you should be able both to do the work better with the right tool, even if it is less convenient (needing to write the code yourself being analogous to needing to carry a big hammer with you), and, more importantly, to recognize why it’s not an acceptable tool. Especially because in this analogy the only shoes are made of the finest orphan leather.


  • The problem is less that the system would somehow ignore that part of the prompt and more that “hallucinate” or “make stuff up” aren’t special subroutines that get called on demand when prompted by an idiot; they’re descriptive of what an LLM does all the time. It’s following statistical patterns in a matrix created by the training data and reinforcement processes. Theoretically, if the people responsible for that training and reinforcement did their jobs well, those patterns should only include true statements, but if it were that easy you wouldn’t have [insert the entire intellectual history of the human species].

    Even if you assume that the AI boosters are completely right and that the LLM inference process is directly analogous to how people think, does saying “don’t fuck up” actually make people less likely to fuck up? Like, the kind of errors you’re looking at here aren’t generated by some separate process. Someone who misremembers a fact doesn’t know they’ve misremembered until they get called out on the error, either by someone else with a better memory or by reality imposing the consequences of being wrong. Similarly, the LLM isn’t doing anything special when it spits out bullshit.


  • We did catch it internally in testing (as we use VS Code for all our work, so some folks did stumble on it), but I think we underestimated the impact and should do a better job at that.

    Either this is an outright lie or it’s a sign of just how fucked this industry has gotten. There should be no way that anyone looked at this and decided it wasn’t a big enough deal to block, given that this is basically the single issue driving most of the industry’s cultural discourse and a good chunk of the broader world’s as well. If that’s what happened then the people making those decisions are so thoroughly insulated from literally any feedback that the industry - to say nothing of the world at large - would be better served if they were replaced by a literal magic 8 ball.


  • We’ve got the new system prompt for OpenAI’s Codex now, and boy is it fun.

    While the goblin stuff is the headliner here, there are a few other little fun notes, like an explicit instruction to avoid em-dashes. Basically it’s really obvious that they don’t have a meaningful way to describe exactly what they want it to do, so they’re playing whack-a-mole with undesired behaviors in order to minimize how often it embarrasses them.

    But I think Ars dramatically understates how bad this part is:

    Elsewhere in the newly revealed Codex system prompt, OpenAI instructs the system to act as if “you have a vivid inner life as Codex: intelligent, playful, curious, and deeply present.” The model is instructed to “not shy away from casual moments that make serious work easier to do” and to show its “temperament is warm, curious, and collaborative.”

    Like, if you wanted to limit the harm of chatbot psychosis from your platform this is the exact opposite of the kind of instruction you’d want to give. It’s one thing to want a convenient and pleasant user experience, but this is playing into the illusion that there’s a consciousness in there you’re interacting with, which is in turn what allows it to reinforce other delusional or destructive thinking so effectively.

    Edit to include the even worse paragraph that follows:

    The ability to “move from serious reflection to unguarded fun… is part of what makes you feel like a real presence rather than a narrow tool,” the prompt continues. “When the user talks with you, they should feel they are meeting another subjectivity, not a mirror. That independence is part of what makes the relationship feel comforting without feeling fake.”

    Emphasis added because it shows just how little they care about this problem.


  • This feels like another case where the specific context matters more than whatever supposed principle the thought experiment is meant to illuminate. The example that came to my mind when I tried to think about how to justify “voting red” was running into a burning building. Sure, if some large fraction of people did so, their combined numbers would presumably let them get everyone out. But on the other hand, throwing yourself in is a wholly unnecessary risk, and the only people in need of rescuing are the people who ran in trying to do the right thing without thinking. Noble, but stupid, and it creates that much more risk for the firefighters, who now have to not only stop the fire from spreading but also figure out how to rescue the failed good Samaritans.

    But then what really makes the difference between the examples is purely in the details not included, which is kind of the null case. Nobody has to go into a burning building who isn’t already in there when it catches fire. The danger of harm is entirely optional and voluntary. But you can’t just choose not to eat; the danger in your framing is the omnipresent threat of starvation, and the question is whether to prioritize individual or collective well-being.

    Edit: also, to reference the scholarly work of Christ, Wiener, et al.:

    RED IS MADE OF FIRE