DeepSeek just released updated R1 models with 'deeper and more complex reasoning patterns'. Includes an R1-distilled Qwen3 8B model boasting "10% improved performance" over the original

SmokeyDope@lemmy.world · edit-2 7 months ago

Even_Adder@lemmy.dbzer0.com · 7 months ago

I’ve gotten deepseek-r1-0528-qwen3-8b to answer correctly once, but not consistently. Abliterated DeepSeek models I’ve used in the past have been able to pass the test.

BaroqueInMind · edit-2 7 months ago

I can’t find any abliterated models of this new release that aren’t quantized to shit and that come as GGUF so they work with my Ollama instance.

deepseek-ai/DeepSeek-R1-0528-Qwen3-8B · Hugging Face