I watched Nvidia's Computex 2024 keynote and it made my blood run cold

Jacobo Da Riva Muñoz@lemmy.sdf.org · 1 year ago

Flying Squid@lemmy.world · 1 year ago

I’ve had multiple people on Lemmy tell me that the amount of energy LLMs use will be trivial. They always base it on the amount of energy used to train the LLMs, not the millions (billions? trillions?) of calculations those LLMs have to do every second they’re used by who knows how many people 24 hours a day.

Then you bring up the water wasting and the best they can do is say something like, “okay, that’s a problem… but only in some places!”

(Some places including much of the United States. Guess where lots of the data centers are?)

sunstoned@lemmus.org · edited 1 year ago

I don’t disagree, but it is useful to point out there are two truths in what you wrote.

The energy use of one person running an already trained model on their own hardware is trivial.

Even the energy use of many, many people using already trained models (ChatGPT, etc.) is still not the problem at hand (probably on the order of the energy usage from a typical search engine).

The energy use in training these models (the appendage measuring contest between tech giants pretending they’re on the cusp of AGI) is where the cost really ramps up.

Flying Squid@lemmy.world · 1 year ago

“(probably on the order of the energy usage from a typical search engine).”

I find that hard to believe. Search engines just regurgitate what is in a database. LLMs have to do calculations to create the sentences they produce. That takes more energy.

sunstoned@lemmus.org · edited 1 year ago

Believe what you will. I’m not an authority on the topic, but as a researcher in an adjacent field I have a pretty good idea. I also self-host Ollama and SearXNG (a metasearch engine, to be clear, not a first-party search engine), so I have some anecdotal inclinations.

Training even a teeny tiny LLM or ML model can run a typical gaming desktop at 100% for days. Sending a query to a pretrained model hardly even shows up in htop unless the model is gigantic. Even the gigantic models only spike the CPU for a few seconds (until the query is complete). SearXNG, again anecdotally, spikes my PC about the same as Mistral in Ollama.
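A rough back-of-envelope sketch of that comparison, using nothing more than energy = power × time. Every number here (the 500 W draw, three days of training, five seconds per query) is a made-up placeholder for illustration, not a measurement from my machine or anyone else's.

```python
def energy_wh(power_watts: float, seconds: float) -> float:
    """Energy in watt-hours for a load of `power_watts` sustained for `seconds`."""
    return power_watts * seconds / 3600.0


# Hypothetical placeholder values, NOT measurements.
DESKTOP_WATTS = 500.0             # assumed draw of a gaming desktop at full load
TRAINING_SECONDS = 3 * 24 * 3600  # assumed training run: three days at 100%
QUERY_SECONDS = 5.0               # assumed load spike for a single query

training_wh = energy_wh(DESKTOP_WATTS, TRAINING_SECONDS)
query_wh = energy_wh(DESKTOP_WATTS, QUERY_SECONDS)

# How many queries it takes to add up to one training run.
queries_to_match = training_wh / query_wh

print(f"training run: {training_wh:.0f} Wh")
print(f"single query: {query_wh:.2f} Wh")
print(f"queries to match one training run: {queries_to_match:.0f}")
```

With these placeholder inputs a single query is tiny next to the training run, but the last line also shows the other side of the argument: enough queries do eventually add up to the same total. Swap in real measurements before drawing any conclusions from it.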

I would encourage you to look at more explanations like the one below. I’m not just blowing smoke, and I’m not dismissing the very real problem of massive training costs (in money, energy, and water) that you’re pointing out.

https://www.baeldung.com/cs/chatgpt-large-language-models-power-consumption