---
license:
- cc-by-nc-4.0
- llama2
language:
- en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
- Mytho
- ReMM
- LLaMA 2
- Quantized Model
- exl2
base_model:
- Undi95/ReMM-v2.2-L2-13B
---
# exl2 quants for ReMM V2.2

This repository contains the quantized models for the ReMM V2.2 model by Undi95. ReMM is a model merge that attempts to recreate MythoMax using the SLERP merging method and newer models.
## Current models
| exl2 Quant | Model Branch | Model Size | Minimum Recommended VRAM (4096 context, fp16 cache) | BPW |
|---|---|---|---|---|
| 3-Bit | main | 5.44 GB | 8 GB GPU | 3.14 |
| 3-Bit | 3bit | 6.36 GB | 10 GB GPU | 3.72 |
| 4-Bit | 4bit | 7.13 GB | 12 GB GPU (10 GB with swap) | 4.2 |
| 4-Bit | 4.6bit | 7.81 GB | 12 GB GPU | 4.63 |
| 5-Bit | Orang Baik's Repo | 8.96 GB | 16 GB GPU (12 GB with swap) | 5.33 |
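As a rough sanity check, a quant's file size scales with its bits per weight: roughly parameters × BPW / 8 bytes. A minimal sketch (the 13-billion parameter count is taken from the model name, and the formula ignores embeddings and quantization metadata, so real files run somewhat larger):

```python
def estimate_quant_size_gb(n_params_billion: float, bpw: float) -> float:
    """Rough size of a quantized model: parameters * bits-per-weight / 8 bits per byte.

    Ignores embeddings and quantization metadata, so the actual files are a bit larger.
    """
    return n_params_billion * 1e9 * bpw / 8 / 1e9  # bytes -> GB

# e.g. the 4.2 BPW quant of a 13B model comes out near the 7.13 GB listed above:
print(f"{estimate_quant_size_gb(13, 4.2):.1f} GB")
```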
## Where to use
There are several places where you can run an exl2 model; here are a few:
- oobabooga's Text Generation Webui
  - When using the downloader, format the model name like this: `Anthonyg5005/ReMM-v2.2-L2-13B-exl2:QuantBranch`
  - For the 5-Bit quant, download from `R136a1/ReMM-v2.2-L2-13B-exl2`
- tabbyAPI
- ExUI
- KoboldAI (clone the repo; don't use a snapshot)
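If you are scripting the download rather than using a UI, the `owner/repo:branch` format above maps onto a repo id plus a branch name. A small sketch (`split_repo_spec` is a hypothetical helper, not part of any of the tools listed):

```python
def split_repo_spec(spec: str) -> tuple[str, str]:
    """Split an 'owner/repo:branch' spec (the webui downloader format above)
    into a repo id and a branch name, defaulting to the 'main' branch."""
    repo_id, _, branch = spec.partition(":")
    return repo_id, branch or "main"

repo_id, branch = split_repo_spec("Anthonyg5005/ReMM-v2.2-L2-13B-exl2:4bit")
print(repo_id, branch)  # Anthonyg5005/ReMM-v2.2-L2-13B-exl2 4bit

# With huggingface_hub installed, the branch maps onto the `revision` argument:
#   from huggingface_hub import snapshot_download
#   snapshot_download(repo_id=repo_id, revision=branch)
```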
## WARNING

This model cannot be used commercially due to the Alpaca dataset license. Use it only for research or personal purposes.