cross-posted from: https://lemmy.ml/post/45766694
Hey :) For a while now I’ve been using gpt-oss-20b on my home lab for lightweight coding tasks and some automation. I’m not really up to date with current self-hosted LLMs, and since the model I’m using was released at the beginning of August 2025 (which feels like an eternity from an LLM development perspective), I wanted to tap the collective wisdom of Lemmy to see whether there is something better out there to replace it.
Edit:
Specs:
GPU: RTX 3060 (12 GB VRAM)
RAM: 64 GB
gpt-oss-20b does not fit into the VRAM completely, but it is partially offloaded and reasonably fast (fast enough for me).


Definitely give Gemma4 26ba4b a try.
It’s MoE, so you should be able to get the same offload, and a4b can be plenty fast.
It has decent world knowledge for its size and, from what I can tell, is good at small-scale coding in common languages like Python.
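For a rough sense of why the offload should work out, here is a back-of-envelope sketch. It assumes the "26b" and "a4b" in the model name mean roughly 26B total and 4B active parameters, and it assumes a quant of about 4.5 bits per weight. Those numbers are my guesses, not from a spec sheet, and KV cache and runtime overhead are ignored.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in GiB (weights only)."""
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

VRAM_GIB = 12    # RTX 3060, from the specs in the post
TOTAL_B = 26     # assumption: total parameters, read from "26b"
ACTIVE_B = 4     # assumption: active parameters per token, read from "a4b"
BITS = 4.5       # assumption: a typical ~4-bit quantization

total = weight_gib(TOTAL_B, BITS)    # all experts, must live somewhere
active = weight_gib(ACTIVE_B, BITS)  # weights actually touched per token

print(f"all weights:    {total:.1f} GiB")
print(f"active weights: {active:.1f} GiB per token")
print(f"fits fully in VRAM: {total <= VRAM_GIB}")
```

Under those assumptions the full weights don't fit in 12 GB, so part of the model ends up in system RAM, same as with gpt-oss-20b. Only the active slice is read per token, though, which is probably why the speed stays usable.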