---
license:
  - cc-by-nc-4.0
  - llama2
language:
  - en
library_name: ExLlamaV2
pipeline_tag: text-generation
tags:
  - Mytho
  - ReMM
  - LLaMA 2
  - Quantized Model
  - exl2
base_model:
  - Undi95/ReMM-v2.2-L2-13B
---

# exl2 quants for ReMM V2.2

This repository contains exl2 quantizations of the ReMM V2.2 model by Undi95. ReMM is a model merge that attempts to recreate MythoMax using the SLERP merging method and newer models.

## Current models

| exl2 Quant | Model Branch | Model Size | Minimum Recommended VRAM (4096 context, fp16 cache) | BPW |
|---|---|---|---|---|
| 3-Bit | main | 5.44 GB | 8 GB GPU | 3.14 |
| 3-Bit | 3bit | 6.36 GB | 10 GB GPU | 3.72 |
| 4-Bit | 4bit | 7.13 GB | 12 GB GPU (10 GB with swap) | 4.2 |
| 4-Bit | 4.6bit | 7.81 GB | 12 GB GPU | 4.63 |
| 5-Bit | Orang Baik's Repo | 8.96 GB | 16 GB GPU (12 GB with swap) | 5.33 |
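The VRAM figures above can be roughly reproduced with a back-of-envelope estimate: quantized weight size scales with parameter count times bits per weight, and the fp16 KV cache adds a context-dependent amount on top. This is a sketch, not this repository's measurement method; the architecture constants are assumptions based on Llama 2 13B, and real files also carry embedding and quantization overhead:

```python
# Rough sizing sketch for exl2 quants of a Llama-2-13B-class model.
# ASSUMPTIONS (not from this repo): ~13e9 params, 40 layers, hidden size 5120.
N_PARAMS = 13e9   # approximate parameter count
N_LAYERS = 40     # Llama 2 13B transformer layers
HIDDEN = 5120     # hidden size (13B uses multi-head attention, so KV width = hidden)

def model_size_gb(bpw: float) -> float:
    """Approximate size of the quantized weights in GB."""
    return N_PARAMS * bpw / 8 / 1e9

def kv_cache_gb(context: int, bytes_per_elem: int = 2) -> float:
    """Approximate fp16 KV-cache size in GB: K and V tensors per layer."""
    return 2 * N_LAYERS * context * HIDDEN * bytes_per_elem / 1e9

for bpw in (3.14, 3.72, 4.2, 4.63, 5.33):
    total = model_size_gb(bpw) + kv_cache_gb(4096)
    print(f"{bpw:.2f} bpw -> ~{total:.1f} GB VRAM at 4096 context")
```

At 4096 context the fp16 cache alone comes to roughly 3.4 GB under these assumptions, which is why the recommended VRAM in the table sits well above the raw model size.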

## Where to use

There are a couple of places where you can use an exl2 model; here are a few:

## WARNING

This model cannot be used commercially due to the Alpaca dataset's license. Use it only for research purposes or personal use.
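Each quant lives on its own branch of this repository (see the Model Branch column in the table above). One common way to fetch a single branch is with `git` and Git LFS; the repository path below is a placeholder, since the exact repo id is not stated in this card:

```sh
# Hypothetical repo path -- replace <user>/<repo> with this repository's actual id.
# Requires git-lfs to pull the model weight files.
git lfs install
git clone --single-branch --branch 4bit https://huggingface.co/<user>/<repo>
```

Using `--single-branch` avoids downloading the weight files of every quant at once.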