---
license:
  - cc-by-nc-4.0
  - llama2
language:
  - en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
  - Mytho
  - ReMM
  - LLaMA 2
  - Quantized Model
  - exl2
base_model:
  - Undi95/ReMM-v2.2-L2-13B
---

# exl2 quants for ReMM V2.2

This repository contains exl2 quantizations of the ReMM V2.2 model by Undi95. ReMM is a model merge that attempts to recreate MythoMax using the SLERP merging method and newer models.

## Current models

| exl2 Quant | Model Branch | Model Size | Minimum Recommended VRAM (4096 context, fp16 cache) | BPW |
|---|---|---|---|---|
| 3-Bit | main | 5.44 GB | 8 GB GPU | 3.14 |
| 3-Bit | 3bit | 6.36 GB | 10 GB GPU | 3.72 |
| 4-Bit | 4bit | 7.13 GB | 12 GB GPU (10 GB with swap) | 4.2 |
| 4-Bit | 4.6bit | 7.81 GB | 12 GB GPU | 4.63 |
| 5-Bit | Orang Baik's Repo | 8.96 GB | 16 GB GPU (12 GB with swap) | 5.33 |
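The VRAM figures above can be roughly reproduced with a back-of-envelope estimate: quantized weight size scales with parameter count times bits per weight, and the fp16 KV cache adds a context-dependent amount on top. This is a sketch, not this repository's measurement method; the architecture constants are assumptions based on Llama 2 13B, and real files also carry embedding and quantization overhead:

```python
# Rough sizing sketch for exl2 quants of a Llama-2-13B-class model.
# ASSUMPTIONS (not from this repo): ~13e9 params, 40 layers, hidden size 5120.
N_PARAMS = 13e9   # approximate parameter count
N_LAYERS = 40     # Llama 2 13B transformer layers
HIDDEN = 5120     # hidden size (13B uses multi-head attention, so KV width = hidden)

def model_size_gb(bpw: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return N_PARAMS * bpw / 8 / 1e9

def kv_cache_gb(context: int, bytes_per_elem: int = 2) -> float:
    """Approximate fp16 KV-cache size in GB: K and V tensors per layer."""
    return 2 * N_LAYERS * context * HIDDEN * bytes_per_elem / 1e9

for bpw in (3.14, 3.72, 4.2, 4.63, 5.33):
    total = model_size_gb(bpw) + kv_cache_gb(4096)
    print(f"{bpw:.2f} bpw -> ~{total:.1f} GB VRAM at 4096 context")
```

At 4096 context the fp16 cache alone comes to roughly 3.4 GB under these assumptions, which is why the recommended VRAM in the table sits well above the raw model size.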

## Where to use

There are a couple of places where you can use an exl2 model; here are a few:

## WARNING

This model cannot be used commercially due to the Alpaca dataset's license. Use it only for research purposes or personal use.
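Each quant lives on its own branch of this repository (see the Model Branch column in the table above). One common way to fetch a single branch is with `git` and Git LFS; the repository path below is a placeholder, since the exact repo id is not stated in this card:

```sh
# Hypothetical repo path -- replace <user>/<repo> with this repository's actual id.
# Requires git-lfs to pull the model weight files.
git lfs install
git clone --single-branch --branch 4bit https://huggingface.co/<user>/<repo>
```

Using `--single-branch` avoids downloading the weight files of every quant at once.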