Starting a Mistral Megathread to aggregate resources.
This is my new favorite 7B model. It is really good for what it is. I am excited to see what we can tune together. I will be using this thread as a living document, so expect a lot of changes, notes, revisions, and updates.
Let me know if there’s something in particular you want to see here. I will be adding to this thread throughout my fine-tuning journey with Mistral.
Mistral Model Megathread
Key
Link #1 - Base Model
Link #2 - Instruct Model
Quantized Base Models from TheBloke
GPTQ
- https://huggingface.co/TheBloke/Mistral-7B-v0.1-GPTQ
- https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GPTQ
GGUF
- https://huggingface.co/TheBloke/Mistral-7B-v0.1-GGUF
- https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-GGUF
AWQ
- https://huggingface.co/TheBloke/Mistral-7B-v0.1-AWQ
- https://huggingface.co/TheBloke/Mistral-7B-Instruct-v0.1-AWQ
Quantized Samantha Models from TheBloke
GPTQ
- https://huggingface.co/TheBloke/samantha-mistral-7B-GPTQ
- https://huggingface.co/TheBloke/samantha-mistral-instruct-7B-GPTQ
GGUF
- https://huggingface.co/TheBloke/samantha-mistral-7B-GGUF
- https://huggingface.co/TheBloke/samantha-mistral-instruct-7B-GGUF
AWQ
- https://huggingface.co/TheBloke/samantha-mistral-7B-AWQ
- https://huggingface.co/TheBloke/samantha-mistral-instruct-7B-AWQ
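The links above all seem to follow the same naming pattern: the model name, then a dash, then the quantization format. Here is a minimal sketch that builds those URLs from that pattern. The names `quant_repo_url` and `FORMATS` are made up for illustration and are not part of any library. The pattern is only inferred from the links in this thread, so repos for other models may not follow it.

```python
# Sketch only: rebuilds the repo URLs from the naming pattern seen in the links above.
# `quant_repo_url` and `FORMATS` are hypothetical names, not a real API.
FORMATS = ("GPTQ", "GGUF", "AWQ")


def quant_repo_url(model_name: str, fmt: str) -> str:
    """Return the TheBloke repo URL for a model name and quantization format."""
    if fmt not in FORMATS:
        raise ValueError(f"unknown format {fmt!r}, expected one of {FORMATS}")
    return f"https://huggingface.co/TheBloke/{model_name}-{fmt}"


if __name__ == "__main__":
    # Link #1 is the base model, Link #2 is the instruct model (see Key).
    for name in ("Mistral-7B-v0.1", "Mistral-7B-Instruct-v0.1"):
        for fmt in FORMATS:
            print(quant_repo_url(name, fmt))
```

Always check that the page actually exists before relying on a generated URL, since this does no network lookup.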
Quantized Kimiko Models from TheBloke
GPTQ
GGUF
AWQ
Quantized Dolphin Models from TheBloke
GPTQ
GGUF
AWQ
Quantized Orca Models from TheBloke
GPTQ
GGUF
AWQ
Quantized Airoboros Models from TheBloke
GPTQ
GGUF
AWQ
If you run any of the quantized/optimized models from TheBloke, do visit the full model page linked from each quantized model card to see and support the developers of each fine-tuned model.
- Mistral - Mistral.ai
- Mistral Samantha - Eric Hartford
- Mistral Kimiko - nRuaif
- Mistral Dolphin - Eric Hartford
- Mistral OpenOrca - OpenOrca/Alignment Lab
- Mistral Airoboros - teknium
Looks like an interesting model. I couldn’t find it on their website. Do you know what the training data was for this model?
https://mistral.ai/news/announcing-mistral-7b/
They don’t publish the training dataset. It’s a secret. There are open bug reports on their GitHub, HuggingFace #8, #10, #38, and I think someone said so explicitly on their Discord.