Using Mac M2 Ultra 192GB to Self-Host LLMs?

shaserlark@sh.itjust.works · 2 years ago

0x01@lemmy.ml · 2 years ago

I do this on my Ultra. Token speed is not great, though it depends on the model, of course. A lot of source code is optimized for Nvidia and won’t use the native Mac GPU unless you modify it, defaulting to the CPU instead. I’ve had to modify about half of what I run.
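For anyone wondering what that kind of modification tends to look like: below is a minimal sketch, assuming the project is PyTorch-based and hardcodes a CUDA-or-CPU choice. The helper names `pick_device` and `detect_device` are made up for illustration and are not from any particular repo; the only real library calls are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device name: Nvidia GPU first, then Apple GPU (MPS), then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe the local machine; fall back to CPU if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds have no MPS backend at all, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    mps_available = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_available)


if __name__ == "__main__":
    # Hypothetical usage: pass the result wherever the code hardcodes "cuda".
    print(detect_device())
```

In practice the change is usually replacing a hardcoded `"cuda"` string with the detected device. Some operations may still be unsupported on the Mac GPU, so this alone is not guaranteed to be enough.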

YMMV, but I find it’s actually cheaper to just use a hosted service.

If you want some specific numbers lmk

shaserlark@sh.itjust.works · 2 years ago

Interesting, is there any kind of model you could run at reasonable speed?

I guess the cost could amortize over time, but if the usability sucks, that may make it not worth it. OTOH, I really don’t want to send my data to any company.
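The amortization question can be put into rough numbers. This is only a sketch: the function name `break_even_months` is made up, and the figures in the example are placeholders, not real prices for the hardware or for any hosted service.

```python
from typing import Optional


def break_even_months(
    hardware_cost: float,
    hosted_monthly: float,
    local_running_monthly: float,
) -> Optional[float]:
    """Months until buying hardware beats paying for a hosted service.

    Returns None when running locally saves nothing per month,
    in which case the purchase never pays for itself.
    """
    monthly_saving = hosted_monthly - local_running_monthly
    if monthly_saving <= 0:
        return None
    return hardware_cost / monthly_saving


if __name__ == "__main__":
    # Placeholder numbers only: 6000 up front, 100/month hosted,
    # 20/month for electricity when running locally.
    print(break_even_months(6000.0, 100.0, 20.0))  # 75.0
```

This ignores resale value, the time spent patching code, and the privacy benefit, which has no price tag but may be the deciding factor.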