Key architectural details

Mixture of Experts (MoE): 128 experts, with 4 active per token, enabling efficient scaling and specialization.

119B total parameters, with 6B active parameters per token (8B including embedding and output layers).

256k context window, supporting long-form interactions and document analysis.

Configurable reasoning effort: Toggle between fast, low-latency responses and deep, reasoning-intensive outputs.

Native multimodality: Accepts both text and image inputs, unlocking use cases from document parsing to visual analysis.
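To make those numbers concrete, here is a back-of-the-envelope sketch of how top-4-of-128 routing turns ~119B total parameters into ~6B active per token. The expert/shared split used below is a hypothetical placeholder chosen to land near the announced figures, not a published breakdown:

```python
# Back-of-the-envelope MoE parameter accounting. Only the 128-expert /
# top-4 routing and the ~119B-total / ~6B-active figures come from the
# announcement; the expert/shared split below is a guessed placeholder.

NUM_EXPERTS = 128  # experts per MoE layer
TOP_K = 4          # experts routed per token

def moe_param_split(expert_params: float, shared_params: float):
    """Return (total, active-per-token) parameter counts."""
    # Each token flows through only TOP_K of NUM_EXPERTS experts,
    # so just that fraction of the expert weights is exercised.
    active = shared_params + expert_params * TOP_K / NUM_EXPERTS
    return shared_params + expert_params, active

# Hypothetical split: ~116.5B in expert FFNs, ~2.5B shared (attention,
# router, norms). Embedding and output layers would add roughly the
# extra ~2B that takes the quoted active count from 6B to 8B.
total, active = moe_param_split(116.5e9, 2.5e9)
print(f"total ≈ {total/1e9:.0f}B, active ≈ {active/1e9:.1f}B per token")
# -> total ≈ 119B, active ≈ 6.1B per token
```

All 119B weights still have to be resident in memory; the routing only cuts the compute (and weight reads) per token down to the ~6B active slice.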

  • fubarx@lemmy.world · 6 hours ago

    At this point, these small models should add explicit minimum hardware requirements just so they can stand out: an STM32 with xx GB of PSRAM; an Android phone with this much RAM, this many TOPS, and a minimum OS version; ESP32-S3 or S4? That sort of thing.

    If you just say ‘small,’ you get lost in the noise.

    • ikt@aussie.zone (OP) · 5 hours ago (edited)

      tbh that’s the main thing I took away from this; since when did small equal 119B?!

      Does that mean they’ve got large models lined up approaching 1T parameters?

      • fubarx@lemmy.world · 4 hours ago

        Cloud-based LLMs have been commoditized. Lots of options.

        There’s room for someone to lead the local on-device space: anything from a high-end workstation (Apple Mac Studio, Nvidia DGX Spark, AMD Strix) to laptops (MacBook Pro, Windows AI PCs) down to embedded (Qualcomm, STM32) and ultra-small (ESP32, ARM/RISC-V).

        Lots of room there and no clear winners. Mistral, at this point, could focus on those other tiers, make a name, and carve out a lot of mindshare.

  • panda_abyss@lemmy.ca · 7 hours ago (edited)

    Looks a little underwhelming with Qwen3.5 and Haiku beating it.

    However, the 6B active parameters, plus the fact that it’s trained to return short results, could make this a useful Qwen alternative for local use. I’ve overall found Mistral models better to discuss with, but the Devstral Small models were kinda janky last I used them (stuff like infinite loops and getting confused by less common programming languages). Qwen models are by far the most verbose out of the box and happily burn a ton of tokens on useless thought; it’s an over-emphasis on reinforcement learning.

    Also weird that they use GPT-4.1 as the judge model. That’s a year-old model, nowhere near SOTA, and IIRC it underwhelmed on most metrics, so it feels like a poor choice of judge.

    Edit: it’s actually GPT-5 – some of the charts are labelled wrong.

    Not mentioned in the blog post, but on HF: they created a small speculative decoding model to go with it – https://huggingface.co/mistralai/Mistral-Small-4-119B-2603-eagle

    That should accelerate inference speeds on some setups.
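    For anyone wondering how that draft model gets wired in: EAGLE-style heads aren’t standalone LMs (they predict off the target model’s hidden states), so they’re loaded through an engine with built-in speculative decoding support such as vLLM. A rough sketch only; the target model ID is assumed from the EAGLE repo’s naming, and vLLM’s speculative-config keys have changed across releases, so check the docs for your installed version:

    ```python
    from vllm import LLM, SamplingParams

    # Sketch only: the target model ID below is assumed from the EAGLE
    # repo's naming, and vLLM's speculative-decoding config keys have
    # changed across releases -- check your installed version's docs.
    llm = LLM(
        model="mistralai/Mistral-Small-4-119B-2603",  # assumed target ID
        speculative_config={
            "method": "eagle",
            "model": "mistralai/Mistral-Small-4-119B-2603-eagle",
            "num_speculative_tokens": 4,  # draft tokens verified per step
        },
    )

    out = llm.generate(
        ["Explain speculative decoding in one sentence."],
        SamplingParams(temperature=0.0, max_tokens=64),
    )
    print(out[0].outputs[0].text)
    ```

    The payoff is that the 392 MB head proposes several tokens cheaply and the 119B target only verifies them in a single batched forward pass, so accepted drafts cost roughly one big-model step instead of several.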

    • MalReynolds@slrpnk.net · 5 hours ago

      For certain values of small…

      That said, Mistral is strong in world knowledge, and something this big likely is too. The 6B active parameters can fit in a reasonable amount of system RAM (Q4_K_M is ~72 GB, so it’d likely run reasonably in 64 GB of system RAM plus 24 GB of VRAM) and run at reasonable if not spectacular speeds; speculative decoding could help too (but that EAGLE draft is 392 MB, which is scarily tiny).
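      The ~72 GB figure checks out from bits-per-weight arithmetic; a quick sanity check (~4.85 bits/weight is the commonly quoted effective rate for Q4_K_M, so treat it as an approximation):

      ```python
      # Sanity-check the Q4_K_M size estimate from bits-per-weight math.
      # ~4.85 bits/weight is the commonly quoted effective rate for
      # Q4_K_M (mixed 4/6-bit blocks plus scales); an approximation.

      params = 119e9
      bpw = 4.85

      print(f"file size ≈ {params * bpw / 8 / 1e9:.0f} GB")    # ≈ 72 GB

      # Per token only the ~6B active parameters are read, which is why
      # a system-RAM + VRAM split can still reach tolerable speeds.
      print(f"read per token ≈ {6e9 * bpw / 8 / 1e9:.1f} GB")  # ≈ 3.6 GB
      ```

      At ~3.6 GB of weight reads per token, even ~50 GB/s of CPU memory bandwidth puts a rough low-teens tokens/s ceiling on the portion held in system RAM, which matches the “reasonable if not spectacular” expectation.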