• 14 Posts
  • 170 Comments
Joined 2 years ago
Cake day: July 19th, 2023



  • The books look alright. I only read the samples. The testimonials from experts are positive. Maybe compare and contrast with Lox from Crafting Interpreters, whose author is not an ally but not known to be evil either. In terms of language design, there’s a lot of truth to the idea that Monkey is a boring ripoff of Tiger, which itself is also boring in order to be easier to teach. I’d say that Ball’s biggest mistake is using Go as the implementation language instead of explaining concepts in a language-neutral fashion; Go makes sense when working on a big long-lived project but not for a single-person exploration.

    Actually, it makes a lot of sense that somebody writing a lot of Go would think that an LLM is impressive. Also, I have to sneer at this:

    Each prompt I write is a line I cast into a model’s latent space. By changing this word here and this phrase there, I see myself as changing the line’s trajectory and its place amidst the numbers. Words need to be chosen with care, since they all have a specific meaning and end up in a specific place in latent space once they’ve been turned into numbers and multiplied with each other, and what I want, what I aim for when I cast, is for the line to end up in just the right spot, so that when I pull on it out of the model comes text that helps me program machines.

    Dude literally just discovered word choice and composition. Welcome to writing! I learned about this in public education when I was maybe 14.


  • I’m guessing that you’re too young to remember. Lucky 10000! In the 1990s, McDonald’s was under attack for a variety of anti-environmentalist practices, and by 2001 a coalition of vegetarians, vegans, and primarily Hindus had brought a class-action lawsuit against them for using beef tallow in fries; the plaintiffs were deeply offended that they had been tricked into consuming what they consider to be a sacred animal. In a nutshell, it’s a very racist and revanchist move, not just an anti-environmentalist move.

    Unusually for me, I can’t link to good peer-reviewed articles on the topic. McDonald’s is one of the few groups who can successfully control their Internet presence, and they’ve washed away these controversies as best they can. I almost feel like linking to this summary of the case on Wikipedia is unhelpful, since it’s got so many apologetic caveats. They do this all over Wikipedia; the articles on McLibel and Liebeck are also heavily edited in favor of McDonald’s. You’ll have to explicitly add “hindu” or “indian” to search queries; for example, instead of “mcdonalds beef tallow”, try “mcdonalds beef tallow hindu indians”.





  • I guess that I’m the resident compiler engineer today. Let’s go.

    So why not write an optimizing compiler in its own language, and then run it on itself?

    The process will reach a fixed point after three iterations. In fancier language, Glück 2009 shows that the fourth, fifth, and sixth Futamura projections are equivalent to the third Futamura projection for a fixed choice of (compiler-)compiler and optimizer. This has practical import for cross-compiling; when I used to use Gentoo, I would watch GCC build itself exactly three times, and we still use triples in our targets today.
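
    A minimal sketch of that three-stage pattern, assuming a toy model where a compiler’s output depends only on the semantics of the compiler doing the work and on the source it is fed (that is, deterministic codegen). The `Binary` type, `compile_with`, and the names `vendor-cc` and `new-cc` are hypothetical; none of this is taken from GCC or from Glück’s paper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binary:
    behaves_like: str  # which source's semantics this binary implements
    codegen_from: str  # which compiler's code generator emitted it


def compile_with(compiler: Binary, source: str) -> Binary:
    # The output implements `source`, but its machine code is shaped by
    # the semantics of whichever compiler did the work.
    return Binary(behaves_like=source, codegen_from=compiler.behaves_like)


NEW_CC = "new-cc"  # hypothetical: the optimizing compiler's own source
stage0 = Binary(behaves_like="vendor-cc", codegen_from="unknown")

stage1 = compile_with(stage0, NEW_CC)  # right behaviour, vendor's codegen
stage2 = compile_with(stage1, NEW_CC)  # right behaviour, own codegen
stage3 = compile_with(stage2, NEW_CC)  # nothing left to change
```

    Under these assumptions stage 1 differs from stage 2, while stages 2 and 3 are identical, so a fourth pass can’t produce anything new. Real bootstraps compare the stage 2 and stage 3 objects for the same reason.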

    [S]uppose you built an optimizing compiler that searched over a sufficiently wide range of possible optimizations, that it did not ordinarily have time to do a full search of its own space — so that, when the optimizing compiler ran out of time, it would just implement whatever speedups it had already discovered.

    Oh, it’s his lucky day! Yud, you’ve just been Schmidhuber’d! Starting in 2003, Schmidhuber’s lab has published research on Gödel machines, self-improving machines which prove that their self-modifications will always be better than previous iterations. They are named not just after Gödel, but after his First Incompleteness Theorem; Schmidhuber et al. easily proved that there will always be at least one speedup theorem which a Gödel machine can never reach (for a given choice of axioms, etc.).

    EURISKO used “heuristics” to, for example, design potential space fleets. It also had heuristics for suggesting new heuristics, and metaheuristics could apply to any heuristic, including metaheuristics. … EURISKO could modify even the metaheuristics that modified heuristics. … Still, EURISKO ran out of steam. Its self-improvements did not spark a sufficient number of new self-improvements.

    Once again the literature on metaheuristics exists, and it culminates in the discovery of genetic algorithms. As such, we can immediately apply the concept of gene-oriented evolution (“beanbag” or “gene pool” reasoning) and note that, if goals don’t change and new genes don’t enter the pool, then eventually the population stagnates as the possible range of mutated genes is tested and exhausted. It doesn’t matter that some genes are “meta” genes that act on other genes, nor that such actions are indirect. Genes are genes.
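
    Here’s a toy version of the beanbag argument. To be loud about it: this is not a real genetic algorithm, just a closure computation under assumptions I’m making up for illustration. Every gene is a lookup table on `N` values, any gene can act as a “meta” gene on any other gene by composition, and no new genes ever enter from outside. The names `act`, `evolve`, and `SEED_GENES` are hypothetical:

```python
from itertools import product

N = 4  # each gene is a function {0..N-1} -> {0..N-1}, stored as a tuple

# Hypothetical starting pool: a rotation and a partial collapse.
SEED_GENES = {(1, 2, 3, 0), (0, 0, 2, 3)}


def act(meta, gene):
    # A "meta" gene rewrites another gene by composing with it.
    return tuple(meta[gene[x]] for x in range(N))


def evolve(seed):
    pool = set(seed)
    history = [len(pool)]  # pool size after each generation
    while True:
        # Every gene acts on every gene, including on itself.
        offspring = {act(m, g) for m, g in product(pool, repeat=2)}
        if offspring <= pool:
            return pool, history  # nothing new: the pool has stagnated
        pool |= offspring
        history.append(len(pool))


pool, history = evolve(SEED_GENES)
```

    The loop has to terminate because there are only `N ** N` possible genes and the pool strictly grows on every pass that doesn’t return. Letting genes act on genes changes how quickly the space is explored, not how big it is.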

    I’m gonna close with a sneer from Jay Bellou, who I hope is not a milkshake duck, in the comments:

    All “insights” eventually bottom out in the same way that Eurisko bottomed out; the notion of ever-increasing gain by applying some rule or metarule is a fantasy. You make the same sort of mistake about “insight” as do people like Roger Penrose, who believes that humans can “see” things that no computer could, except that you think that a computer can too, whereas in reality neither humans nor computers have access to any such magical “insight” sauce.





  • A lot of court documents are sealed or redacted, so I can’t quite get at all the details. Nonetheless here’s what I’ve got so far:

    • Chrome is just the browser, including Chromium, but not ChromiumOS (a Gentoo fork, basically) or ChromeOS (the branded OS on Chromebooks)
    • Chrome is unaffordable because it was quite expensive to build and continues to be a maintenance burden
    • The government is vaguely aware that forcing a sale of Chrome could be adverse for the market but the court hasn’t said anything on the topic yet
    • Via a filing from Apple, the court is aware that Firefox materially depends on Google, although they haven’t done much beyond allowing Apple to file as amicus

    The court hasn’t cracked open AMD v Intel yet, where it was found that a cash remedy would be better than punishing the ongoing business concerns of a duopoly, but it would be one possible solution: instead of selling Chrome, Google would have to pay its competitors a lump sum and change its business practices somewhat.

    I am genuinely not sure what happens to “the browser market”, as it were. The Brave and Safari teams are relatively small because they make tweaks on top of an existing browser core; the extreme propagation of Electron suggests that once a browser is written, it does not need to be written again. The court may find browsers to be a sort of capital which is worth a lot of money on its own but not expensive to maintain. This would destroy Mozilla along with Google!




  • It’s the cost of the electricity, not the cost of the GPU!

    Empirically, we might estimate that a single training-capable GPU can pull nearly 1 kilowatt; an H100 GPU board is rated for 700W of thermal dissipation on its own, and the board pulls more than that when memory is active. I happen to live in the Pacific Northwest near lots of wind, rivers, and solar power, so electricity is barely 18 cents/kilowatt-hour and I’d say that it costs at least a dollar to run such a GPU (at full load) for 6hrs. Also, I estimate that the GPU market is currently offering a 50% discount on average for refurbished/like-new GPUs with about 5yrs of service, and the H100 is about $25k new, so they might depreciate at around $2500/yr. Finally, I picked the H100 because it’s around the peak of efficiency for this particular AI season; local inference is going to be more expensive when we do apples-to-apples units like tokens/watt.

    In short, with bad napkin arithmetic, an H100 costs at least $4/day to operate while depreciating only $6.85/day or so; operating costs approach or exceed the depreciation rate. This leads to a hot-potato market where reselling the asset is worth more than operating it. In the limit, assets with no depreciation relative to opex are treated like securities, and we’re already seeing multiple groups squatting like dragons upon piles of nVidia products while the cost of renting cloudy H100s has jumped from like $2/hr to $9/hr over the past year. VCs are withdrawing, yes, and they’re no longer paying the power bills.
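
    For anybody who wants to redo the napkin with their own numbers, here’s a sketch. The inputs are just my estimates from above (1 kW draw, 18 cents/kWh, $25k new, 50% resale after 5yrs), and straight-line depreciation is an assumption of mine, not how the resale market actually behaves:

```python
GPU_POWER_KW = 1.0      # estimated full-load draw of one H100 board
PRICE_PER_KWH = 0.18    # my local electricity price, in dollars
NEW_PRICE = 25_000      # rough price of a new H100, in dollars
RESALE_FRACTION = 0.5   # estimated resale value after SERVICE_YEARS
SERVICE_YEARS = 5


def power_cost(hours: float) -> float:
    """Dollars of electricity to run the GPU at full load for `hours`."""
    return GPU_POWER_KW * hours * PRICE_PER_KWH


# Straight-line depreciation (an assumption) from new price to resale value.
depreciation_per_year = NEW_PRICE * (1 - RESALE_FRACTION) / SERVICE_YEARS
depreciation_per_day = depreciation_per_year / 365

cost_6h = power_cost(6)      # the "at least a dollar" figure
cost_day = power_cost(24)    # the "at least $4/day" figure

print(f"power, 6 hrs:     ${cost_6h:.2f}")
print(f"power, per day:   ${cost_day:.2f}")
print(f"depreciation/yr:  ${depreciation_per_year:.2f}")
print(f"depreciation/day: ${depreciation_per_day:.2f}")
```

    Cooling, networking, and the rest of the datacenter aren’t in here, which is why every operating figure above is an “at least”.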


  • I went into this with negative expectations; I recall being offended in high school that The Flashbulb was artificially sped up, unlike my heroes of neoclassical guitar and progressive-rock keyboards, and I’ve felt that their recent thoughts on newer music-making technology have been hypocritical. That said, this was a great video and I’m glad you shared it.

    Ears and eyes are different. We deconvolve visual data in the brain, but our ears actually perform a Fourier decomposition with physical hardware. As a result, psychoacoustics is a real and non-trivial science, used e.g. in MP3, which limits what an adversary can do to frustrate classification or learning, because the result still has to sound like music in order to get any playtime among humans. Meanwhile I’m always worried that these adversarial groups are going to accidentally propagate something like McCollough stripes, a genuine cognitohazard that causes edges to become color-coded in the visual cortex for (up to) months after a few minutes of exposure; it’s a kind of possible harm that fundamentally defies automatic classification by definition.

    HarmonyCloak seems like a fairly boring adversarial tool for protecting the music industry from the music industry. Their code is incomplete and likely never going to get properly published; again we’re seeing an industry-capture research group taking and not giving back to the Free Software community. I think all of the demos shown here are genuine, but he fully admits that this is a compute-intensive process which I estimate is going to slide back out of affordability by the end of 2026. This is going to stop being effective as soon as we get back into AI winter, but I’m not going to cry for Nashville.

    I really like the two attacks shown near the end, starting around 22:00. The first attack, if genuinely not audible to humans, is likely a Mosquito-style frequency that is above hearing range and physically vibrates the components of the microphone. Hofstadter and the Tortoise would be proud, although I’m concerned about the potential long-term effects on humans. The second attack is again adversarial but specific to models on home-assistant devices which are trained to ignore some loud sounds; I can’t tell spectrographically whether that’s also done above hearing range or not. I’m reluctant to call for attacks on home assistants, but they’re great targets.

    Fundamentally this is a video that doesn’t want to talk about how musicians actually rip each other off. The “tones and rhythms” that he keeps showing with nice visualizations have been machine-learnable for decades, ranging from beat-finders to frequency-analyzers to chord-spellers to track-isolators built into our music editors. He doubles down on copyright despite building businesses that profit from Free Software. And, most gratingly, he talks about the Pareto principle while ignoring that the typical musician is never able to make a career out of their art.



  • In practice, the behaviors that the chatbots learn in post-training are FUD and weasel-wording; they appear not to unlearn facts, but to learn so much additional nuance as to bury the facts. The bots perform worse on various standardized tests about the natural world after post-training; there are quantitative downsides to forcing them to adopt any particular etiquette, including speaking like a chud.

    The problem is mostly that the uninformed public will think that the chatbot is knowledgeable and well-spoken because it rattles off the same weak-worded hedges as right-wing pundits, and it’s addressed by the same improvements in education required to counter those pundits.

    Answering your question directly: no, slop machines can’t be countered with more slop machines without drowning us all in slop. A more direct approach will be required.


  • Yes, but the article’s not actually about that. It’s about Microsoft returning to the same datacenter-building schedule from a decade ago. Datacenters have a lag of about 3-5yrs depending on what’s inside them and where they’re located, so what we’re actually seeing is Microsoft projecting a relative reduction in overall usage. Note that among all the cancellations of notes and prospective claims, Microsoft isn’t walking back their two-decade nuclear-power deal with Constellation; they’re not destroying or reducing any existing capacity, just planning to build less. At risk of quoting Bloomberg:

    After a frantic expansion to support OpenAI and other artificial intelligence projects, [Microsoft] expects spending to shift from new construction to fitting out data centers with servers and other equipment.

    To the extent that the bubble is popping, Microsoft and other datacenter owners have to guess half a decade in advance when the bubble will pop, and if you take them at their word — that is, if we assume that they canceled these contracts with perfect foresight — then the bubble must have already popped in 2023-2024, and the market is experiencing coyote time because…? More likely, this is fallout from their ongoing breakup with OpenAI, who almost certainly begged Microsoft for so much compute (and definitely begged for too many nVidia GPUs!) that Microsoft had to adjust their datacenter plans. The bubble’s not done until OpenAI has exhausted all possible funding, say in late 2025 or early 2026 when Softbank and the Saudis realize that they’ve made a hilarious mistake.

    We’ve discussed this previously on awful.systems, both the value of nuclear-energy contracts and Microsoft’s retraction of intents.