---
datasets:
  - c-s-ale/alpaca-gpt4-data
  - Open-Orca/OpenOrca
  - Intel/orca_dpo_pairs
  - allenai/ultrafeedback_binarized_cleaned
  - HuggingFaceH4/no_robots
license: cc-by-nc-4.0
language:
  - en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
  - Mistral
  - SOLAR
  - Quantized Model
  - exl2
base_model:
  - rishiraj/meow
---

# exl2 quants for meow

This repository contains exl2 quantized versions of the meow model by Rishiraj Acharya. meow is a fine-tune of SOLAR-10.7B-Instruct-v1.0 on the no_robots dataset.

## Current models

| exl2 BPW | Model Branch | Model Size | Minimum VRAM (4096 Context) |
| --- | --- | --- | --- |
| 2-Bit | main | 3.28 GB | 6GB GPU |
| 4-Bit | 4bit | 5.61 GB | 8GB GPU |
| 5-Bit | 5bit | 6.92 GB | 10GB GPU, 8GB with swap |
| 6-Bit | 6bit | 8.23 GB | 10GB GPU |
| 8-Bit | 8bit | 10.84 GB | 12GB GPU |
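Each quant lives on its own git branch of this repository, as the table above shows. A minimal sketch of selecting the right branch for a given bit width, with a commented hint for downloading it via huggingface_hub (the repo id in the comment is hypothetical):

```python
# Branch names taken from the table above: the 2-bit quant is on "main",
# the higher-bit quants are on branches named after their bit width.
QUANT_BRANCHES = {2: "main", 4: "4bit", 5: "5bit", 6: "6bit", 8: "8bit"}

def branch_for(bits: int) -> str:
    """Return the git branch that holds the requested exl2 quant."""
    try:
        return QUANT_BRANCHES[bits]
    except KeyError:
        raise ValueError(f"no {bits}-bit quant is available") from None

# To fetch one quant without cloning every branch (repo id is a placeholder):
# from huggingface_hub import snapshot_download
# snapshot_download("user/meow-exl2", revision=branch_for(4))
```

Downloading a single branch this way avoids pulling all five quants at once.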

## Note

Using a 12GB NVIDIA GeForce RTX 3060, I averaged around 20 tokens per second on the 8-bit quant at full 4096 context.

## Where to use

There are several places you can run an exl2 model; here are a few:

## WARNING

This model cannot be used commercially due to the Alpaca dataset license. Use it only for research or personal purposes.