I’ve got a homelab setup that could benefit from low-power AI acceleration, which would let me run Whisper and distilled models locally and integrate with my services, e.g. Home Assistant. Plus, the less data I send over my network the happier I’ll be.

I don’t really want to stuff a GPU into my system right now; I don’t have much power budget, and GPUs that are actually useful get pricey. I’ve seen a few examples of “edge accelerators” which boast a super tiny (2–5 W) power envelope and 40 TOPS, but that doesn’t tell me much about how well models will actually work in practice.

Is there any kind of mapping between TOPS and, say, tokens per second for a given model? Maybe a recommended TOPS figure per model?

  • towhee [he/him]@hexbear.net
    1 month ago

    Roughly speaking, each token requires the computer to fetch & iterate over the entire model in memory. So memory bandwidth is usually the constraint. If you put a 10 GB model on it and the memory bandwidth is 10 GB/s (number made up) it will be one second per token. If you have multiple compute cores, each perhaps with their own 10 GB/s memory bandwidth limit, then you can divide one second by the number of cores to get the time per token.
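The rule of thumb above can be sketched as a toy calculation (the 10 GB / 10 GB/s figures are the made-up example numbers from the comment; real hardware overlaps compute and memory access, so treat this as a best-case estimate):

```python
# Back-of-envelope decode speed for a memory-bandwidth-bound model:
# each generated token streams the whole model from memory once.

def tokens_per_second(model_size_gb: float, bandwidth_gb_s: float, cores: int = 1) -> float:
    """Best case: bandwidth / model size, scaled by cores that each
    have their own independent memory bandwidth (a simplification)."""
    seconds_per_token = model_size_gb / bandwidth_gb_s
    return cores / seconds_per_token

# 10 GB model on a 10 GB/s link: one token per second on one core.
print(tokens_per_second(10, 10))           # 1.0
print(tokens_per_second(10, 10, cores=4))  # 4.0
```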

    Idk why you would use a USB stick and not just run it in CPU/RAM on an ordinary computer. Small models are shit anyway though (even against the baseline of large/frontier hosted models being shit).

    • aanes_appreciator [he/him, comrade/them]@hexbear.netOP
      1 month ago

      Laziness and the prospect of a cheap hack to avoid having to drag my server out of its confines to sort it. Saw an ad a while ago and have had the thought ever since!

      Oh, and the Coral TPUs are at least M.2, but yeah I can see why USB dongles are just a meme…

      • towhee [he/him]@hexbear.net
        1 month ago

        At least try running a local model on your regular computer first to see whether you can deal with how shit they are. The quality of a model is roughly proportional to its size in memory (that’s why the memory chip market is fucked right now). Computation speed only controls how fast it generates tokens.

  • doodoo_wizard@lemmy.ml
    1 month ago

    Idk about now, but the low-powered AI accelerators of olde weren’t meant for that.

    The Google (née Coral!) ones, for example, really shined at object recognition but weren’t good for text-to-text or TTS (I didn’t try hard).

    If you’re not willing to get a GPU, you’re better off maxing out your RAM and doing that stuff on CPU.

    If you are willing to get a GPU, you can still do what you need with the old-ass Maxwell and Pascal ones. They’ll be awful at image generation but fine for text.

    I also want to carry the good word of not worrying about power consumption to you! It doesn’t matter! Pcs aren’t expensive to run! They have low idle draw! Power is cheap!

    If you have to know for sure about the power impact, get a Kill A Watt, plug your shit into it, and be confident in your new knowledge.

    • aanes_appreciator [he/him, comrade/them]@hexbear.netOP
      1 month ago

      Yeah I could do that tbh. DDR4 isn’t so bad price-wise…

      I’ll see what the lower budget cards of the last few years look like. I’m a lazy sod.

      Idle power consumption isn’t a massive issue for me, but I’m more finicky about it with my NAS, as I’d prefer to have the cooling and power reserved for my drives (and expansion thereof).