Bag of words, have mercy on us

corbin@awful.systems · 4 hours ago

Hey now, at least the bowl of salvia has a theme, predictable effects, immersive sensations, and the ability to make people feel emotions.

corbin@awful.systems · 3 days ago

Thanks! You’re getting better with your insults; that’s a big step up from your trite classics like “sweet summer child”. As long as you’re here and not reading, let’s not read from my third link:

As a former musician, I know that there is no way to train a modern musician, or any other modern artist, without heavy amounts of copyright infringement. Copying pages at the library, copying CDs for practice, taking photos of sculptures and paintings, examining architectural blueprints of real buildings. The system simultaneously expects us to be well-cultured, and to not own our culture. I suggest that, of those two, the former is important and the latter is yet another attempt to coerce and control people via subversion of the public domain.

Maybe you’re a little busy with your Biblical work-or-starve mindset, but I encourage you to think about why we even have copyright if it must be flaunted in order to become a skilled artist. It’s worth knowing that musicians don’t expect to make a living from our craft; we expect to work a day job too.

corbin@awful.systems · 3 days ago

Previously, on Awful:

[Copyright i]s not for you who love to make art and prize it for its cultural impact and expressive power, but for folks who want to trade art for money.

Quoting Anarchism Triumphant, an extended sneer against copyright:

I wanted to point out something else: that our world consists increasingly of nothing but large numbers (also known as bitstreams), and that - for reasons having nothing to do with emergent properties of the numbers themselves - the legal system is presently committed to treating similar numbers radically differently. No one can tell, simply by looking at a number that is 100 million digits long, whether that number is subject to patent, copyright, or trade secret protection, or indeed whether it is “owned” by anyone at all. So the legal system we have - blessed as we are by its consequences if we are copyright teachers, Congressmen, Gucci-gulchers or Big Rupert himself - is compelled to treat indistinguishable things in unlike ways.

Or more politely, previously, on Lobsters:

Another big problem is that it’s not at all clear whether information, in the information-theoretic sense, is a medium through which expressive works can be created; that is, it’s not clear whether bits qualify for copyright. Certainly, all around the world, legal systems have assumed that bits are a medium. But perhaps bits have no color. Perhaps homomorphic encryption implies that color is unmeasurable. It is well-accepted even to legal scholars that abstract systems and mathematics aren’t patentable, although the application of this to computers clearly shows that the legal folks involved don’t understand information theory well enough.

Were we anti-copyright leftists really so invisible before, or have you been assuming that No True Leftist would be anti-copyright?

corbin@awful.systems · 3 days ago

Closely related is a thought I had after responding to yet another paper that says hallucinations can be fixed:

I’m starting to suspect that mathematics is not an emergent skill of language models. Formally, given a fixed set of hard mathematical questions, it doesn’t appear that increasing training data necessarily improves the model’s ability to generate valid proofs answering those questions. There could be a sharp divide between memetically-trained models which only know cultural concepts and models like Gödel machines or genetic evolution which easily generate proofs but have no cultural awareness whatsoever.

corbin@awful.systems · 4 days ago

“Not Winston Smith?” So, O’Brien?

corbin@awful.systems · 8 days ago

Boring unoriginal argument combined with a misunderstanding of addiction. On addiction, go read FOSB and stop thinking of it as a moral failing. On behavioral control, it’s clear that you didn’t actually read what I said. Let me emphasize it again:

The problem isn’t people enjoying their fetishes; the problem is the financial incentives and resulting capitalization of humans leading to genuine harms.

From your list, video games, TV, D&D, and group sex are not the problem. Rather, loot boxes, TV advertisements, churches, MLMs, and other means of psychological control are the problem. Your inability to tell the difference between a Tupperware party (somewhat harmful), D&D (almost never harmful), and joining churches (almost always harmful) suggests that you’re thinking of behavioral control in terms of rugged individualist denial of any sort of community and sense of belonging, rather than in terms of the harms which people suffer. Oh, also, when you say:

One cannot rescue such people by condemning what they do, much like one cannot stop self destruction by banning the things they use.

Completely fucking wrong. Condemning drunk driving has reduced the overall amount of drunk driving, and it also works on an interpersonal level. Chemists have self-regulated to prevent the sale of massive quantities of many common chemicals, including regulation on the basis that anybody purchasing that much of a substance could not do anything non-self-destructive with it. What you mean to say is that polite words do not stop somebody from consuming an addictive substance, but it happens to be the case that words are only the beginning of possible intervention.

corbin@awful.systems · 9 days ago

Well, imagine a romance novel that tries to manipulate you. For example, among the many repositories of erotica on the Web, there are scripts designed to ensnare and control the reader, disguised as stories about romance. By reading a story, or watching a video, or merely listening to some well-prepared audio file, a suggestible person can be dramatically influenced by a horny tale. It is common for the folks who make such pornography to include a final suggestion at the end; if you like what you read/heard/saw, subscribe and send money and obey. This eventually leads to findom: the subject becomes psychologically or sexually gratified by the act of being victimized in a blatant financial scam, leading to the subject seeking out further victimization. This is all a heavily sexualized version of the standard way that propaganda (“public relations”, “advertising”) is used to induce compulsive shopping disorders; it’s not just a kinky fetish thing. And whether they like it or not, products like OpenAI’s ChatGPT are necessarily reinforcement-learned against saying bad things about OpenAI, which will lead to saying good things about OpenAI; the product will always carry its trainer’s propaganda.

Or imagine a romance novel that varies in quality by chapter. Some chapters are really good! But maybe the median chapter is actually not very good. Maybe the novel is one in a series. Maybe you have an entire shelf of novels, with one or two good chapters per novel, and you can’t wait to buy the next one because it’ll have one good chapter maybe. This is the sort of gambling addiction that involves sitting at a slot machine and pulling it repeatedly. Previously, on Awful (previously on Pivot to AI, even!) we’ve discussed how repeatedly prompting a chatbot is like pulling a slot machine, and the users of /r/MyBoyfriendIsAI do appear to tell each other that sometimes reprompting or regenerating responses will be required in order to ~~sustain the delusion~~ maximize the romantic charm of their electronic boyfriend.

I’m not saying this to shame the folks into erotic mind control or saying that it always leads to findom, just to be clear. The problem isn’t people enjoying their fetishes; the problem is the financial incentives and resulting capitalization of humans leading to genuine harms. (I am shaming people who are into gambling. Please talk about your issues with your family and be open to reconciliation.)

corbin@awful.systems · 9 days ago

I tried to substantiate the claim that multiple users from that subreddit are self-hosting. Reading the top 120 submissions, I did find several folks moving to Grok (1, 2, 3) and Mistral’s Le Chat (1, 2, 3). Of those, only the last two appear to actually have discussion about self-hosting; they are discussing Mistral’s open models like Mistral-7B-Instruct which indeed can be run locally. For comparison, I also checked the subreddit /r/LocalLLaMA, which is the biggest subreddit for self-hosting language models using tools like llama.cpp or Ollama; there’s zero cross-posts from /r/MyBoyfriendIsAI or posts clearly about AI boyfriends in the top 120 submissions there. That is, I found no posts that combine tools like llama.cpp or Ollama and models like Mistral-7B-Instruct into a single build-your-own-AI-boyfriend guide. Amusingly, one post gives instructions for how to ask ChatGPT about how to set up Ollama.

Also, I did find multiple gay and lesbian folks; this is not a sub solely for women or heterosexuals. Not that any of our regular commenters were being jerks about this, but it’s worth noting.

What’s more interesting to me are the emergent beliefs and descriptors in this community. They have a concept of “being rerouted;” they see prompted agents as a sort of nexus of interconnected components, and the “routing” between those components controls the bot’s personality. Similarly, they see interactions with OpenAI’s safety guardrails as interactions with a safety personality, and some users have come to prefer it over the personality generated by ChatGPT-4o or ChatGPT-5. Finally, I notice that many folks are talking about bot personalities as portable between totally different models and chat products, which is not a real thing; it seems like users are overly focused on specific memorialized events which linger in the chat interface’s history, and the presence of those events along with a “you are my perfect boyfriend” sort of prompt is enough to ~~trigger a delusional episode~~ summon the perfect boyfriend for a lovely evening.

(There’s some remarkable bertology in there, too. One woman’s got a girlfriend chatbot fairly deep into a degenerated distribution such that most of its emitted tokens are asterisks, but because of the Markdown rendering in the chatbot interface, the bot appears to shift between italic and bold text and most asterisks aren’t rendered. It’s a cool example of a productive low-energy distribution.)

corbin@awful.systems · 10 days ago

Things I don’t want to know more about: there’s a reasonable theory that Eigenrobot is influencing USA politics; certain magic numbers in Eigen’s tweets have been showing up in some of the protectionism coming out of the White House. Stubbing this mostly in the hope that somebody else feels like doing the research.

corbin@awful.systems · 11 days ago

Community sneer from this orange-site comment:

We know from Bell’s theorem that any locally causal model that correctly describes observations needs to violate measurement independence. Such theories are sometimes called “superdeterministic”. It is therefore clear that to arrive at a local collapse model, we must use a superdeterministic approach.

I only got the first 1/2 of my physics degree before moving on to CS, but to me this reads as “We know eternal life can only be obtained from unicorn blood, so for this paper we must use a fairytale approach.”

corbin@awful.systems · 11 days ago

Thanks, this was an awful skim. It feels like she doesn’t understand why we expect gravity to propagate like a wave at the speed of light; it’s not just an assumption of Einstein but has its own independent measurement and corroboration. Also, the focus on geometry feels anachronistic; a century ago she could have proposed a geometric explanation for why nuclei stay bound together and completely overlooked gluons. To be fair, she also cites GRW but I guess she doesn’t know that GRW can’t be made relativistic. Maybe she chose GRW because it’s not yet falsified rather than for its potential to explain (relativistic) gravity. The point at which I get off the train is a meme that sounds like a Weinstein whistle:

What I am assuming here is then that in the to-be-found underlying theory, geometry carries the same information as the particles because they are the same. Gravity is in this sense fundamentally different from the other interactions: The electromagnetic interaction, for example, does not carry any information about the mass of the particles. … Concretely, I will take this idea to imply that we have a fundamental quantum theory in which particles and their geometry are one and the same quantum state.

To channel dril a bit: there’s no inherent geometry to spacetime, you fool. You trusted your eyeballs too much. Your brain evolved to map 2D and 3D so you stuck yourself into a little Euclidean video game like Decartes reading his own books. We observe experimental data that agrees with the presumption of 3D space. We already know that time is perceptual and that experimentally both SR and GR are required to navigate spacetime; why should space not be perceptual? On these grounds, even fucking MOND has a better basis than Geometric Unity, because MOND won’t flip out if reality is not 3D but 3.0000000000009095…D while Weinstein can’t explain anything that isn’t based on a Rubik’s-cube symmetry metaphor.

She doesn’t even mention dark matter. What a sad pile of slop. At least I learned the word for goldstinos while grabbing bluelinks.

corbin@awful.systems · edit-2 12 days ago

Obituaries are being run for John Searle. Most obituaries will focus on the Chinese Room thought experiment, an important bikeshed in AI research noted for the ease with which freshmen can incorrectly interpret it. I’m glad to see that Wikipedia puts above the Chinese Room the fact that he was a landlord who sued the city of Berkeley and caused massive rent increases in the 1990s; I’m also happy that Wikipedia documents his political activity and sexual-assault allegations.

corbin@awful.systems · 14 days ago

On a theoretical basis, this family of text-smuggling attacks can’t be prevented. Indeed, the writeup for the Copilot version, which Microsoft appears to have mitigated, suggested that some filtering of forbidden Unicode would be much easier than some fundamental fix. The underlying confusable deputy is still there and core to the product as advertised. On one hand, Google is right; it’s only exploitable via social engineering or capability misuse. On the other hand, social engineering and capability misuse are big problems!

This sort of confused-deputy attack is really common in distributed applications whenever an automatic process is doing something on behalf of a human. The delegation of any capability to a chatbot is always going to lead to possible misuse because of one of the central maxims of capability security: the ability to invoke a capability is equivalent to the permission to invoke it. Also, in terms of linguistics and narremes, it is well-known that merely mentioning that a capability exists will greatly raise the probability that the chatbot chooses to invoke it, not unlike how a point-and-click game might provoke a player into trying every item at every opportunity. I’ll close with a quote from that Copilot writeup:

Automatic Tool Invocation is problematic as long as there are no fixes for prompt injection as an adversary can invoke tools that way and (1) bring sensitive information into the prompt context and (2) probably also invoke actions.

corbin@awful.systems · 15 days ago

second bongrip Manjaro is an indoctrination program to load up Linux newbies with stupid questions before sending them to Gentoo forums~

corbin@awful.systems · 17 days ago

Good post but it’s overfocused on “technical” as a meaningful and helpful word for denotation. Quoting what I just said on Mastodon:

To be technical is to pay attention to details. That’s all. A (classical) computer is a detail machine; it only operates upon bits, it only knows bits, and it only decides bits. To be technical is to try to keep pace with the computer and know details as precisely as it does. Framed this way, it should be obvious that humans aren’t technical and can’t really be technical. This fundamental insecurity is the heart of priestly gatekeeping of computer science.

If a third blog post trying to define “technical” goes around again then I’ll write a full post.

corbin@awful.systems · 17 days ago

Yes, and it’s been this way since the 90s. The original slop algorithm, Dissociated press, was given in 1972 (in HAKMEM!) and has been operationalized since the mid-80s.

corbin@awful.systems · 19 days ago

I guess I’m the local bertologist today; look up Dr. Bender for a similar take.

When we say that LLMs only have words, we mean that they only manipulate syntax with first-order rules; the LLM doesn’t have a sense of meaning, only an autoregressive mapping which associates some syntax (“context”, “prompt”) to other syntax (“completion”). We’ve previously examined the path-based view and bag-of-words view. Bender or a category theorist might say that syntax and semantics are different categories of objects and that a mapping from syntax to semantics isn’t present in an LLM; I’d personally say that an LLM only operates with System 3 — associative memetic concepts — and is lacking not only a body but also any kind of deliberation. (Going further in that direction, the “T” in “GPT-4” is for Transformers; unlike e.g. Mamba, a Transformer doesn’t have System 2 deliberation or rumination, and Hofstadter suggests that this alone disqualifies Transformers from being conscious.)

If you made a perfect copy of me, a ‘model’, I think it would have consciousness. I would want the clone treated well even if some of the copied traits weren’t perfect.

I think that this collection of misunderstandings is the heart of the issue. A model isn’t a perfect copy. Indeed, the reason that LLMs must hallucinate is that they are relatively small compared to their training data and therefore must be lossy compressions, or blurry JPEGs as Ted Chiang puts it. Additionally, no humans are cloned in the training of a model, even at the conceptual level; a model doesn’t learn to be a human, but to simulate what humans might write. So when you say:

Spinal injuries are terrible. I don’t think ‘text-only-human’ should fail the consciousness test.

I completely agree! LLMs aren’t text-only humans, though. An LLM corresponds to a portion of the left hemisphere, particularly Broca’s area, except that it drives a tokenizer instead; chain-of-thought “thinking” corresponds to rationalizations produced by the left-brain interpreter. Humans are clearly much more than that! For example, an LLM cannot feel hungry because it does not have a stomach which emits a specific hormone that is interpreted by a nervous system; in this sense, LLMs don’t have feelings. Rather, what should be surprising to you is the ELIZA effect: a bag of words that can only communicate by mechanically associating memes to inputs is capable of passing a Turing test.

Also, from one philosopher to another: try not to get hung up on questions of consciousness. What we care about is whether we’re allowed to mistreat robots, not whether robots are conscious; the only reason to ask the latter question is to have presumed that we may not mistreat the conscious, a hypocrisy that doesn’t withstand scrutiny. Can matrix multiplication be conscious? Probably not, but the shape of the question (“chat is this abstractum aware of itself, me, or anything in its environment”) is kind of suspicious! For another fun example, IIT is probably bogus not because thermostats are likely not conscious but because “chat is this thermostat aware of itself” is not a lucid line of thought.

corbin@awful.systems · 19 days ago

I think it’s the other way around. The memes are incredibly good at left vs right because left- and right-leaning people presume underlying facts and the memes reassure people that those facts are true and good (or false and bad, etc.) without doing any fact-finding.

When we say “the right can’t meme” what we mean is that the right’s memes are about projecting bigotry. It’s like saying that the right has no comedians; of course they have people that stand up in front of an audience and emit words according to memes, tropes, and narremes, such that the audience laughs. Indeed, stand-up was invented by Frank Fay, an open fascist. (His Behind the Bastards episodes are quite interesting.) What we’re saying is that the stand-up routine is bigoted. If this seems unrelated, please consider: the Haitians-eating-pets joke is part of a stand-up routine that a clown tells in order to get his circus elected.

corbin@awful.systems · 21 days ago

My name is Schmidt F. I’m 27 years old. My house is in the Mennonite region of Dutch Pennsylvania, where all the farms are, and I am trad-married. I work as the manager for the Single Sushi matchmaking service, and I get home every day by sunset at the latest. I don’t smoke, but I occasionally drink. I’m in bed by two candles and make sure I sleep until sunrise, no matter what. After having a glass of warm unpasteurized milk and doing about twenty minutes of prayer before going to bed, I usually have no problems sleeping until morning. Just like a real Mennonite, I wake up without any fatigue or stress in the morning. I was told there were no issues at my last one-on-one with my pastor. I’m trying to explain that I’m a person who wishes to live a very quiet life, as long as I have Internet access. I take care not to trouble myself with any enemies, like JavaScript and Python, that would cause me to lose sleep at night. That is how I deal with society, and I think that is what brings me happiness. Although, if I were to write code I wouldn’t lose to anyone.

corbin@awful.systems · 22 days ago

Funnier: Yes, it’s what happens today, and Silicon Valley is old enough that we can compare and contrast with the beginning of techbro art! The original techbro film is Toy Story (1995), which is much weirder if viewed with e.g. the precept that Buzz’s designers are Elon fans or the idea that (some of) the toys are robots. Of course, from the outside, AI toy robots make folks think of Small Soldiers (1998); “generic” and “slop” are definitely part of the style. Also, as long as we’re talking of “pearly blobs” I have to bring up The Abyss (1989) before anybody else. I hope at least one of these is a lucky 10000 for you because they’re all classic films.