

That’s unfair.
Beaker deserves better than to get compared to a eugenicist cryptofascist.
Fellas it’s almost June in the year of the “agents” and frankly I don’t see shit.
LLM agents can beat Pokemon… if you give them enough customized tools and prompting that with the same number of lines of instruction you could just directly code a bot that beats Pokemon without an LLM in the first place. And you don’t mind the LLM agent playing much much worse than literal children.
Yeah I pretty much agree. Penrose compares favorably to other cases of Nobel disease because the bar is so low (the Wikipedia page has got examples of racism, eugenics, homeopathy, astrology), not because his ideas about quantum consciousness are actually good. It’s not good to cite Penrose as someone notable who disagrees with the possibility of AGI, because the reason he disagrees is that he believes in quantum mysticism and misunderstands Gödel’s theorem and computer science.
Yeah it’s really not productive to engage directly.
I’d almost categorize Penrose as a borderline case of Nobel disease himself for stuff he’s said about quantum consciousness and relatedly the halting problem and Gödel’s incompleteness theorem. But he actually has a proposed mechanism (involving microtubules) that is testable and falsifiable and the physics half of what he is talking about is within his domain of expertise.
Stephen Hawking was starting to promote AI doomerism in 2014. But he’s not a Nobel prize winner. Yoshua Bengio is a doomer, but no Nobel prize either, although he is pretty decorated in awards. So yeah looks like one winner and a few other notable doomers that aren’t actually Nobel Prize winners somehow became winners plural in Scott’s argument from authority. Also, considering the long list of examples of Nobel disease, I really don’t think Nobel Prize winner endorsement is a good way to gauge experts’ attitudes or sentiment.
He claims he was explaining what others believe not what he believes, but if that is so, why are you so aggressively defending the stance?
Literally the only difference between Scott’s beliefs and AI 2027 as a whole is his prophecy estimate is a year or two later. (I bet he’ll be playing up that difference as AI 2027 fails to happen in 2027, then also doesn’t happen in 2028.)
Elsewhere in the thread he whines to the mods that the original poster is spamming every subreddit vaguely lesswrong or EA related with engagement bait. That poster is katxwoods… as in Kat Woods… as in a member of Nonlinear, the EA “organization” whose idea of philanthropic research was nonstop exotic vacations around the world. And, iirc, they are most infamous among us sneerers for “hiring” an underpaid (really underpaid, like couldn’t afford basic necessities) intern they also used as a 24/7 live-in errand girl, drug runner, and sexual servant.
Yeah, allowing the framing that blog post uses is already conceding a lot to EA and overlooking the bigger problems they have.
Yeah I think long term Trump wrecking US soft power might be good for the world. There is going to be a lot of immediate suffering because a lot of those programs were also doing good things (in addition to strengthening US soft power or pushing a neocolonial agenda or whatever else).
I was just about to point out several angles this post neglects but it looks like from the edit this post is just intended to address a narrower question. Among the angles outside the intended question: philanthropy by the ultra-wealthy often serves as a tool for reputation laundering and influence building. I guess the same criticism can be made about a lot of conventional philanthropy, but I don’t think that should absolve EA.
This post somewhat frames the question as a comparison between EA and conventional philanthropy and foreign aid efforts… which okay, but that is a low bar especially when you look at some of the stuff the US has done with its foreign aid.
The prompt’s random usage of markup notations makes obtuse black magic programming seem sane and deterministic and reproducible. Like how did they even empirically decide on some of those notation choices?
You can make that point empirically just looking at the scaling that’s been happening with ChatGPT. The Wikipedia page for generative pre-trained transformer has a nice table. Key takeaway, each model (i.e. from GPT-1 to GPT-2 to GPT-3) is going up 10x in tokens and model parameters and 100x in compute compared to the previous one, and (not shown in this table unfortunately) training loss (log of perplexity) is only improving linearly.
He also wants instant gratification, so taking months to have a team put together a racist data set is a lot of effort for him.
This is especially ironic with all of Elon’s claims about making Grok truth seeking. Well, “truth seeking” was probably always code for making an LLM that would parrot Elon’s views.
Elon may have failed at making Grok peddle racist conspiracy theories like he wanted, but this shouldn’t be taken as proof that LLMs can’t be manipulated that way. He probably went with the laziest option possible of directly prompting it as opposed to fine tuning it on racist content or anything more advanced.
Do you like SCP Foundation content? There is an SCP directly inspired by Eliezer and lesswrong. It’s kind of wordy and long. And in the discussion the author waffled on owning that it was a mockery of Eliezer.
I think they also want recognition/credit for spending 5 minutes (or less) typing some words at an image generator as if that were comparable to people who develop technical skills and then create effortful meaningful work just because the outputs are (superficially) similar.
You had me going until the very last sentence. (To be fair to me, the OP broke containment and has attracted a lot of unironically delivered opinions almost as bad as your satirical spiel.)
The latest twist I’m seeing isn’t blaming your prompting (although they’re still eager to do that), it’s blaming your choice of LLM.
“Oh, you’re using shitGPT 4.1-4o-o3 mini _ro_plus for programming? You should clearly be using Gemini 3.5.07 pro-doubleplusgood, unless you need something locally run, then you should be using DeepSek_v2_r_1 on your 48 GB VRAM local server! Unless you need nice sounding prose, then you actually need Claude Limmerick 3.7.01. Clearly you just aren’t trying the right models, so allow me to educate you with all my prompt fondling experience. You’re trying to make some general point? Clearly you just need to try another model.”
It can make funny pictures, sure. But it fails at art as an endeavor to communicate an idea, feeling, or intent of the artist: the promptfondler artists are providing a few sentences of instruction and the GenAI is following them without any deeper feelings or understanding of context or meaning or intent.
GPT-1 is 117 million parameters, GPT-2 is 1.5 billion parameters, GPT-3 is 175 billion, GPT-4 is undisclosed but estimated at 1.7 trillion. Tokens needed for training and training compute scale linearly with model size (edit: actually I was wrong, looking at the Wikipedia page it is even worse for your case than I was saying: training compute scales quadratically with model size, going up 2 OOM for every 10x of parameters). They are improving… but only getting a linear improvement in training loss for a geometric increase in model size and training time. A hypothetical GPT-5 would have 10 trillion parameters and genuinely need to be AGI to have the remotest hope of paying off its training. And it would need more quality tokens than they have left, they’ve already scraped the internet (including many copyrighted sources and sources that requested not to be scraped). So that’s exactly why OpenAI has been screwing around with fine-tuning setups with illegible naming schemes instead of just releasing a GPT-5. But fine-tuning can only shift what you’re getting within distribution, so it trades off in getting more hallucinations or overly obsequious output or whatever the latest problem they are having.
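If you want to check the arithmetic yourself, here is a rough sketch in Python. It only uses the parameter counts quoted above (the GPT-4 figure is an outside estimate, not a disclosed number), and it takes the "2 OOM of compute per 10x of parameters" rule of thumb from the comment as an assumption rather than measuring anything. The function name `growth_steps` is made up for this example.

```python
import math

# Parameter counts as quoted in the comment above.
# The GPT-4 number is an estimate; OpenAI has not disclosed it.
PARAMS = {
    "GPT-1": 117e6,
    "GPT-2": 1.5e9,
    "GPT-3": 175e9,
    "GPT-4 (est.)": 1.7e12,
}

def growth_steps(params):
    """Return (prev, cur, param_ratio, param_ooms, assumed_compute_ooms) per generation."""
    names = list(params)
    steps = []
    for prev, cur in zip(names, names[1:]):
        ratio = params[cur] / params[prev]
        ooms = math.log10(ratio)
        # Assumption taken from the comment, not measured here:
        # compute grows roughly quadratically with parameter count,
        # i.e. about 2 orders of magnitude per 10x of parameters.
        compute_ooms = 2 * ooms
        steps.append((prev, cur, ratio, ooms, compute_ooms))
    return steps

for prev, cur, ratio, ooms, compute_ooms in growth_steps(PARAMS):
    print(f"{prev} -> {cur}: {ratio:.1f}x params "
          f"({ooms:.2f} OOM), ~{compute_ooms:.2f} OOM compute if quadratic")
```

Worth noting the steps are not a uniform 10x: going by these numbers, GPT-2 to GPT-3 alone is a bit over two orders of magnitude in parameters, while the other two steps are close to one each.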
Lower model temperature makes it pick its best guess for the next token as opposed to randomizing among probable guesses. It doesn’t improve on what the best guess is, and you can still get hallucinations even picking the “best” next token.
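To make the temperature point concrete, here is a minimal sketch of temperature-scaled softmax in plain Python. The logits are made-up numbers for three hypothetical candidate tokens, not output from any real model, and `softmax_with_temperature` is just an illustrative helper. The thing to look at is that lowering the temperature piles more probability onto the top token but never changes which token is on top.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate next tokens (illustrative only).
logits = [2.0, 1.5, 0.1]

for temperature in (1.0, 0.5, 0.1):
    probs = softmax_with_temperature(logits, temperature)
    best = probs.index(max(probs))
    print(f"T={temperature}: best token index={best}, "
          f"probs={[round(p, 3) for p in probs]}")
```

If the model's top-ranked token is wrong, a lower temperature just makes it pick that wrong token more consistently.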
And lol at you trying to reverse the accusation against LLMs by accusing me of regurgitating/hallucinating.
Yep. If you’re looking for a snappy summary of this situation, this reddit comment had a nice one. An open source LLM Pokemon harness/scaffold has 4.8k lines of Python, and is missing features essential to Gemini’s harness. Whereas an open source Lua script to play Pokemon is 7.2k lines, was written in 2014, and consistently speedruns the game in under two hours.