|
---
license: apache-2.0
tags:
- trl
- orpo
- generated_from_trainer
- exl2
base_model: mistral-community/Mixtral-8x22B-v0.1
datasets:
- argilla/distilabel-capybara-dpo-7k-binarized
model-index:
- name: zephyr-orpo-141b-A35b-v0.1
  results: []
---
|
|
|
<img src="https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1/resolve/main/logo.png" alt="Zephyr 141B Logo" width="400" style="margin-left: auto; margin-right: auto; display: block;"/>
|
|
|
# machinez/zephyr-orpo-141b-A35b-v0.1-exl2 |
|
This model was converted to EXL2 format from [`HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1`](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1). |
|
Refer to the [original model card](https://huggingface.co/HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1) for more details on the model. |
|
|
|
Each branch contains an individual bits-per-weight quantization; the `main` branch contains only the measurement.json used for further conversions.
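To see which bits-per-weight branches are currently published, you can list the repository's branches without cloning anything:

```shell
# List all remote branches (one per bits-per-weight variant, plus main)
git ls-remote --heads https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2
```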
|
|
|
<a href="https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2/tree/1_5">1.5 bits per weight - fits dual RTX 3090/4090 or triple Nvidia Tesla P100 16GB at 4k context</a>
|
|
|
<a href="https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2/tree/2_75">2.75 bits per weight - fits quad Nvidia Tesla P100 16GB at 16k context</a>
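If you need a bits-per-weight target that is not listed above, the measurement.json from the `main` branch can be reused so exllamav2's conversion script does not have to repeat the measurement pass. A rough sketch, assuming a local checkout of the exllamav2 repository and a downloaded copy of the original full-precision model; the paths, output directory, and 3.0bpw target are placeholders, and the flag names should be checked against `python convert.py -h` for your exllamav2 version:

```shell
# Reuse the published measurement.json to skip the measurement pass.
# All paths and the 3.0 bpw target below are illustrative placeholders.
python convert.py \
  -i /models/zephyr-orpo-141b-A35b-v0.1 \
  -o /tmp/exl2-work \
  -cf /models/zephyr-orpo-141b-A35b-v0.1-3.0bpw-exl2 \
  -m measurement.json \
  -b 3.0
```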
|
|
|
## Sample config to load in TabbyAPI @ 1.5bpw on 3x Nvidia Tesla P100 16GB at 4k context
|
```json
{
  "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_1.5bpw",
  "max_seq_len": 4096,
  "override_base_seq_len": 4096,
  "gpu_split_auto": false,
  "autosplit_reserve": [
    96
  ],
  "gpu_split": [
    14.15,
    14,
    15
  ],
  "rope_scale": 1,
  "rope_alpha": 1,
  "no_flash_attention": false,
  "cache_mode": "fp16",
  "prompt_template": "string",
  "num_experts_per_token": 0,
  "use_cfg": true,
  "fasttensors": false,
  "skip_queue": false
}
```
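The JSON above is the body of a model-load request, so it can be posted straight to a running TabbyAPI instance. A minimal sketch, assuming TabbyAPI's default port of 5000, its `/v1/model/load` admin endpoint, and that `$ADMIN_KEY` holds the admin key from your TabbyAPI token config; save the config above as `load_1.5bpw.json` first:

```shell
# Hypothetical example: ask TabbyAPI to load the 1.5bpw quant using the config above
curl -X POST http://localhost:5000/v1/model/load \
  -H "Content-Type: application/json" \
  -H "x-admin-key: $ADMIN_KEY" \
  -d @load_1.5bpw.json
```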
|
|
|
## Sample config to load in TabbyAPI @ 2.75bpw on 4x Nvidia Tesla P100 16GB at 16k context
|
```json
{
  "name": "Machinez_zephyr-orpo-141b-A35b-v0.1_2.75bpw",
  "max_seq_len": 16384,
  "override_base_seq_len": 16384,
  "gpu_split_auto": false,
  "autosplit_reserve": [
    96
  ],
  "gpu_split": [
    12.5,
    13,
    13,
    16.1
  ],
  "rope_scale": 1,
  "rope_alpha": 1,
  "no_flash_attention": false,
  "cache_mode": "fp16",
  "prompt_template": "string",
  "num_experts_per_token": 0,
  "use_cfg": true,
  "fasttensors": false,
  "skip_queue": false
}
```
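Once either config has finished loading, a quick smoke test against TabbyAPI's OpenAI-compatible completions endpoint confirms the model is serving (again assuming the default port, with `$API_KEY` holding your regular, non-admin API key):

```shell
# Quick smoke test once the model is loaded
curl http://localhost:5000/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $API_KEY" \
  -d '{"prompt": "The Eiffel Tower is located in", "max_tokens": 16}'
```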
|
|
|
## Download instructions |
|
|
|
With git: |
|
|
|
```shell |
|
git clone --single-branch --branch 2_75 https://huggingface.co/machinez/zephyr-orpo-141b-A35b-v0.1-exl2 |
|
``` |
|
|
|
With the huggingface_hub CLI (credit to TheBloke for the original instructions, borrowed from bartowski):
|
|
|
```shell |
|
pip3 install -U "huggingface_hub[cli]" |
|
``` |
|
|
|
## (Optional) Log in to the Hugging Face Hub
|
```shell
git config --global credential.helper 'store --file ~/.my-credentials'
huggingface-cli login
```
|
|
|
To download the `main` branch (only useful if you just need the measurement.json) to a folder called `zephyr-orpo-141b-A35b-v0.1-exl2`:
|
|
|
```shell
mkdir zephyr-orpo-141b-A35b-v0.1-exl2
huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 --local-dir zephyr-orpo-141b-A35b-v0.1-exl2 --local-dir-use-symlinks False
```
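Since the `main` branch holds nothing but the measurement file, you can also fetch just that single file instead of the whole branch:

```shell
# Download only measurement.json from the main branch
huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 measurement.json --local-dir .
```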
|
|
|
To download from a different branch, add the `--revision` parameter: |
|
|
|
```shell
mkdir machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw
huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 --revision 2_75 --local-dir machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw --local-dir-use-symlinks False
```
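On fast connections the downloads above can usually be sped up with the optional `hf_transfer` backend; install it and set the environment variable before running `huggingface-cli download`:

```shell
# Optional: faster downloads on high-bandwidth connections
pip3 install hf_transfer
HF_HUB_ENABLE_HF_TRANSFER=1 huggingface-cli download machinez/zephyr-orpo-141b-A35b-v0.1-exl2 --revision 2_75 --local-dir machinez_zephyr-orpo-141b-A35b-v0.1-exl2_2.75bpw --local-dir-use-symlinks False
```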