Selfhosted LLM (ChatGPT)

autopilot@lemmy.world · edited 3 years ago

𝒍𝒆𝒎𝒂𝒏𝒏 · 3 years ago

I personally use llama.cpp in a VM. However, if you have an NVIDIA GPU with lots of VRAM, you have more options available, as well as much faster inference (text generation).

Check out the community at !localllama@sh.itjust.works; they’re pretty experienced with running LLMs locally.
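
On the "lots of VRAM" point, here is a rough back-of-the-envelope sketch for how much memory the model weights alone would need. The function name `estimate_weight_gib` is made up for illustration, and the estimate deliberately ignores the context cache and runtime overhead, so treat the result as a lower bound rather than a real requirement.

```python
def estimate_weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights only, in GiB.

    Assumption: every parameter is stored at the same precision.
    Not included: context (KV) cache, activations, runtime overhead.
    """
    total_bytes = n_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / 2**30                    # bytes -> GiB


if __name__ == "__main__":
    # Example: a 7-billion-parameter model at 16-bit vs. 4-bit precision.
    for bits in (16, 4):
        print(f"7B @ {bits}-bit: ~{estimate_weight_gib(7e9, bits):.1f} GiB")
```

The takeaway is that lower-precision (quantized) weights shrink the footprint roughly in proportion to the bit width, which is why smaller cards can still run some models.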

AdventureSpoon@kbin.social · 3 years ago

Why NVIDIA specifically?

𝒍𝒆𝒎𝒂𝒏𝒏 · 3 years ago

At the moment, most LLM libraries use CUDA for acceleration. CUDA is NVIDIA’s proprietary GPU compute platform, so it only runs on NVIDIA GPUs.

I believe llama.cpp can make use of AMD GPUs, but double-check the project’s GitHub discussions first to confirm this and to see how people set it up.
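
If you are not sure which GPU stack your machine has, a quick check like the one below might help. It is only a sketch: it assumes the vendor tools `nvidia-smi` (installed with the NVIDIA driver) and `rocm-smi` (installed with AMD's ROCm) are on your PATH, and the helper name `detect_gpu_stacks` is made up for illustration. Finding a tool does not mean your llama.cpp build was compiled with support for that backend.

```python
import shutil

# Assumption: these vendor tools are on PATH when the matching stack is installed.
VENDOR_TOOLS = {
    "nvidia-smi": "CUDA (NVIDIA)",
    "rocm-smi": "ROCm (AMD)",
}


def detect_gpu_stacks(which=shutil.which):
    """Return labels for the GPU stacks whose vendor tool is found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [label for tool, label in VENDOR_TOOLS.items() if which(tool)]


if __name__ == "__main__":
    found = detect_gpu_stacks()
    print("Detected:", ", ".join(found) if found else "no known GPU tooling")
```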