Upload README.md
README.md CHANGED
@@ -7,7 +7,7 @@ language:
 - en
 library_name: transformers
 license: apache-2.0
-model_creator:
+model_creator: Devin Gulliver
 model_name: OpenInstruct Mistral 7B
 model_type: mistral
 pipeline_tag: text-generation
@@ -45,13 +45,13 @@ quantized_by: TheBloke
 <!-- header end -->
 
 # OpenInstruct Mistral 7B - GPTQ
-- Model creator: [
+- Model creator: [Devin Gulliver](https://huggingface.co/monology)
 - Original model: [OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b)
 
 <!-- description start -->
 # Description
 
-This repo contains GPTQ model files for [
+This repo contains GPTQ model files for [Devin Gulliver's OpenInstruct Mistral 7B](https://huggingface.co/monology/openinstruct-mistral-7b).
 
 Multiple GPTQ parameter permutations are provided; see Provided Files below for details of the options provided, their parameters, and the software used to create them.
 
@@ -64,7 +64,7 @@ These files were quantised using hardware kindly provided by [Massed Compute](ht
 * [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-AWQ)
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GGUF)
-* [
+* [Devin Gulliver's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/monology/openinstruct-mistral-7b)
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
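For reference, any of the repos listed above can be mirrored locally before loading or converting it; a minimal sketch using `huggingface_hub` (assumed to be installed; it is not named in this README):

```python
# Minimal sketch: mirror one of the listed repos locally with
# huggingface_hub before loading or converting it. Swap repo_id for
# the AWQ or GGUF repo id to fetch those variants instead.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="TheBloke/openinstruct-mistral-7B-GPTQ")
print(f"Downloaded to: {local_dir}")
```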
@@ -121,12 +121,12 @@ Most GPTQ files are made with AutoGPTQ. Mistral models are currently made with T
 
 | Branch | Bits | GS | Act Order | Damp % | GPTQ Dataset | Seq Len | Size | ExLlama | Desc |
 | ------ | ---- | -- | --------- | ------ | ------------ | ------- | ---- | ------- | ---- |
-| [main](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/main) | 4 | 128 | Yes | 0.1 | [
-| [gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-32g-actorder_True) | 4 | 32 | Yes | 0.1 | [
-| [gptq-8bit--1g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit--1g-actorder_True) | 8 | None | Yes | 0.1 | [
-| [gptq-8bit-128g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-128g-actorder_True) | 8 | 128 | Yes | 0.1 | [
-| [gptq-8bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-32g-actorder_True) | 8 | 32 | Yes | 0.1 | [
-| [gptq-4bit-64g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-64g-actorder_True) | 4 | 64 | Yes | 0.1 | [
+| [main](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/main) | 4 | 128 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.16 GB | Yes | 4-bit, with Act Order and group size 128g. Uses even less VRAM than 64g, but with slightly lower accuracy. |
+| [gptq-4bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-32g-actorder_True) | 4 | 32 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.57 GB | Yes | 4-bit, with Act Order and group size 32g. Gives highest possible inference quality, with maximum VRAM usage. |
+| [gptq-8bit--1g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit--1g-actorder_True) | 8 | None | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.52 GB | No | 8-bit, with Act Order. No group size, to lower VRAM requirements. |
+| [gptq-8bit-128g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-128g-actorder_True) | 8 | 128 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 7.68 GB | No | 8-bit, with group size 128g for higher inference quality and with Act Order for even higher accuracy. |
+| [gptq-8bit-32g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-8bit-32g-actorder_True) | 8 | 32 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 8.17 GB | No | 8-bit, with group size 32g and Act Order for maximum inference quality. |
+| [gptq-4bit-64g-actorder_True](https://huggingface.co/TheBloke/openinstruct-mistral-7B-GPTQ/tree/gptq-4bit-64g-actorder_True) | 4 | 64 | Yes | 0.1 | [VMware Open Instruct](https://huggingface.co/datasets/VMware/open-instruct/viewer/) | 4096 | 4.29 GB | Yes | 4-bit, with Act Order and group size 64g. Uses less VRAM than 32g, but with slightly lower accuracy. |
 
 <!-- README_GPTQ.md-provided-files end -->
 
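Any branch in the table above can be selected at load time with the `revision` argument; a minimal sketch, assuming a transformers stack with GPTQ support installed (e.g. optimum and auto-gptq; exact requirements depend on your versions):

```python
# Minimal sketch: load one GPTQ variant from the table by branch name.
# The revision string must match a table row exactly; omitting it
# loads the 4-bit, 128g files from `main`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/openinstruct-mistral-7B-GPTQ"
branch = "gptq-4bit-32g-actorder_True"  # highest 4-bit quality, ~4.57 GB

tokenizer = AutoTokenizer.from_pretrained(model_id, revision=branch)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    revision=branch,
    device_map="auto",  # place layers on the available GPU(s)
)
```

The trade-off runs the way the Desc column describes: smaller group size raises quality and VRAM use, and only the 4-bit branches are marked ExLlama-compatible.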
@@ -384,7 +384,7 @@ And thank you again to a16z for their generous grant.
 
 <!-- footer end -->
 
-# Original model card:
+# Original model card: Devin Gulliver's OpenInstruct Mistral 7B
 
 
 # OpenInstruct Mistral-7B
@@ -395,7 +395,7 @@ Quantized to FP16 and released under the [Apache-2.0](https://choosealicense.com
 Compute generously provided by [Higgsfield AI](https://higgsfield.ai/model/655559e6b5777dab620095e0).
 
 
-Prompt format: Alpaca
+## Prompt format: Alpaca
 ```
 Below is an instruction that describes a task. Write a response that appropriately completes the request.
 
@@ -405,4 +405,10 @@ Below is an instruction that describes a task. Write a response that appropriate
 ### Response:
 ```
 
+## Recommended preset:
+- temperature: 0.2
+- top_k: 50
+- top_p: 0.95
+- repetition_penalty: 1.1
+
 \*as of 21 Nov 2023. "commercially-usable" includes both an open-source base model and a *non-synthetic* open-source finetune dataset. updated leaderboard results available [here](https://huggingfaceh4-open-llm-leaderboard.hf.space).
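Putting the prompt format and the recommended preset together, a minimal sketch with the standard transformers `generate()` API (the `### Instruction:` block follows the usual Alpaca layout, which the excerpt above elides):

```python
# Minimal sketch: Alpaca-format prompt plus the recommended preset.
# do_sample=True is required for temperature/top_k/top_p to take effect;
# max_new_tokens is an arbitrary choice, not part of the preset.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/openinstruct-mistral-7B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nName three uses of instruction-tuned models.\n\n"
    "### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.2,
    top_k=50,
    top_p=0.95,
    repetition_penalty=1.1,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```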