# mistral-nemo-gutenberg-12B-v4-exl2
This repository contains various EXL2 quantisations of nbeerbower/mistral-nemo-gutenberg-12B-v4.
Quantisations available:
| Branch | Description | Recommended |
|---|---|---|
| 2.0-bpw | 2 bits per weight | Low quality; smallest available quantisation |
| 3.0-bpw | 3 bits per weight | |
| 4.0-bpw | 4 bits per weight | ✔️ Recommended for low-VRAM environments |
| 5.0-bpw | 5 bits per weight | |
| 6.0-bpw | 6 bits per weight | ✔️ Best quality/VRAM balance |
| 6.5-bpw | 6.5 bits per weight | ✔️ Near-perfect quality, slightly higher VRAM usage |
| 8.0-bpw | 8 bits per weight | Best available quality; almost always unnecessary |
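To pull just one of the branches above rather than the whole repository, the branch name can be passed as the `revision` argument to `snapshot_download` from the `huggingface_hub` package. A minimal sketch, assuming `huggingface_hub` is installed; the `weight_gib` helper is purely illustrative (it estimates weight storage only, ignoring KV cache and runtime overhead):

```python
def weight_gib(n_params: float, bpw: float) -> float:
    """Rough GiB needed just to hold the weights: params * bpw / 8 bytes.
    Illustrative estimate only; excludes KV cache and framework overhead."""
    return n_params * bpw / 8 / 2**30


def fetch(branch: str) -> str:
    """Download a single quantisation branch and return its local path."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="CameronRedmore/mistral-nemo-gutenberg-12B-v4-exl2",
        revision=branch,  # e.g. "6.0-bpw" selects one branch from the table
    )


# For a ~12B-parameter model, the 6.0-bpw branch needs roughly
# weight_gib(12e9, 6.0) GiB of VRAM for the weights alone.
```

Usage: `fetch("6.0-bpw")` downloads only the recommended quality/VRAM-balance quantisation.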
## Original README
TheDrummer/Rocinante-12B-v1 fine-tuned on jondurbin/gutenberg-dpo-v0.1.
### Method

Fine-tuned for 3 epochs using an A100 on Google Colab.