---
license: apache-2.0
tags:
- trl
- orpo
- generated_from_trainer
- exl2
base_model: mistral-community/Mixtral-8x22B-v0.1
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
model-index:
- name: zephyr-orpo-141b-A35b-v0.1
results: []
---
# machinez/zephyr-orpo-141b-A35b-v0.1-exl2
This model was converted to EXL2 format from [`HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1`](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1).
Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1) for more details on the model.
Each branch contains a quantization at a different bits per weight (bpw); the `main` branch contains only the measurement.json needed for further conversions.
- 1.5 bits per weight: fits dual RTX 3090/4090, or triple Nvidia Tesla P100 16 GB, at 4k context
- 2.75 bits per weight: fits quad Nvidia Tesla P100 16 GB at 16k context
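
For a rough sense of why these splits work, the weight footprint scales linearly with bits per weight. A minimal back-of-the-envelope sketch in Python, assuming all 141B parameters are quantized and ignoring KV cache and activation overhead (so treat the result as a floor, not a full VRAM budget):

```python
# Rough VRAM estimate for the quantized weights alone.
# KV cache and activations add overhead on top of this.
TOTAL_PARAMS = 141e9  # 141B total parameters (MoE; all experts stay resident)

for bpw in (1.5, 2.75):
    weight_gib = TOTAL_PARAMS * bpw / 8 / 1024**3
    print(f"{bpw} bpw -> ~{weight_gib:.1f} GiB of weights")

# 1.5 bpw  -> ~24.6 GiB (fits 2x 24 GB or 3x 16 GB cards, with headroom for cache)
# 2.75 bpw -> ~45.1 GiB (needs 4x 16 GB cards)
```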
## Sample TabbyAPI load config @ 1.5bpw on 3x Nvidia Tesla P100 16 GB at 4k context
```JSON
{
  "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_1.5bpw",
  "max_seq_len": 4096,
  "override_base_seq_len": 4096,
  "gpu_split_auto": false,
  "autosplit_reserve": [96],
  "gpu_split": [14.15, 14, 15],
  "rope_scale": 1,
  "rope_alpha": 1,
  "no_flash_attention": false,
  "cache_mode": "fp16",
  "prompt_template": "string",
  "num_experts_per_token": 0,
  "use_cfg": true,
  "fasttensors": false,
  "skip_queue": false
}
```
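
If you drive TabbyAPI over its HTTP admin API rather than a config file, the same JSON can be POSTed to the model-load endpoint. A minimal sketch, assuming a local TabbyAPI on port 5000 and its `/v1/model/load` admin route (check your installed version's docs; `YOUR_ADMIN_KEY` is a placeholder for your own admin key):

```python
# Hypothetical loader script: POSTs the config above to a running TabbyAPI
# instance. Endpoint path and admin-key header follow TabbyAPI's admin API;
# verify both against your installed version.
import requests

config = {
    "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_1.5bpw",
    "max_seq_len": 4096,
    "gpu_split_auto": False,
    "gpu_split": [14.15, 14, 15],
    "cache_mode": "fp16",
}

resp = requests.post(
    "http://localhost:5000/v1/model/load",
    headers={"x-admin-key": "YOUR_ADMIN_KEY"},  # placeholder admin key
    json=config,
    timeout=600,  # a 141B model takes a while to load
)
resp.raise_for_status()
print(resp.text)
```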
## Sample TabbyAPI load config @ 2.75bpw on 4x Nvidia Tesla P100 16 GB at 16k context
```JSON
{
  "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_2.75bpw",
  "max_seq_len": 16384,
  "override_base_seq_len": 16384,
  "gpu_split_auto": false,
  "autosplit_reserve": [96],
  "gpu_split": [12.5, 13, 13, 16.1],
  "rope_scale": 1,
  "rope_alpha": 1,
  "no_flash_attention": false,
  "cache_mode": "fp16",
  "prompt_template": "string",
  "num_experts_per_token": 0,
  "use_cfg": true,
  "fasttensors": false,
  "skip_queue": false
}
```
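
Once the model is loaded, a short completion request confirms the weights and `gpu_split` are working end to end. A minimal smoke-test sketch, assuming TabbyAPI's OpenAI-compatible chat endpoint on port 5000 (`YOUR_API_KEY` is a placeholder for your own inference key):

```python
# Quick smoke test against TabbyAPI's OpenAI-compatible chat endpoint.
import requests

resp = requests.post(
    "http://localhost:5000/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder API key
    json={
        "model": "Machinez_zephyr-orpo-141b-A35b-v0.1_2.75bpw",
        "messages": [{"role": "user", "content": "Say hello in one sentence."}],
        "max_tokens": 32,
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```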
## Download instructions
With git:
```shell
git clone --single-branch --branch 2_75 https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2
```
With the `huggingface_hub` CLI (instructions credit to TheBloke, borrowed from bartowski):
```shell
pip3 install -U "huggingface_hub[cli]"
```
## (Optional) Store credentials and log in
```shell
git config --global credential.helper 'store --file ~/.my-credentials'
huggingface-cli login
```
To download the `main` branch (only useful if you only care about measurement.json) to a folder called `machinez_zephyr-orpo-141b-A35b-v0.1-exl2`:
```shell
mkdir machinez_zephyr-orpo-141b-A35b-v0.1-exl2
huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 --local-dir machinez_zephyr-orpo-141b-A35b-v0.1-exl2 --local-dir-use-symlinks False
```
To download from a different branch, add the `--revision` parameter:
```shell
mkdir machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw
huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 --revision 2_75 --local-dir machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw --local-dir-use-symlinks False
```
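
The same download can also be scripted with the `huggingface_hub` Python API; a minimal sketch mirroring the `2_75` branch command above:

```python
# Python equivalent of the CLI command above. `revision` selects the
# quantization branch ("2_75" here); omit it to fetch main
# (measurement.json only).
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="machinez/zephyr-orpo-141b-A35b-v0.1-exl2",
    revision="2_75",
    local_dir="machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw",
)
```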