---
license: agpl-3.0
language:
- en
model_creator: PygmalionAI
quantized_by: Crataco
tags:
- ggml
- text generation
inference: false
---

*(Not to be confused with [Pygmalion 13B](https://huggingface.co/TehVenom/Pygmalion-13b-GGML) or [Pygmalion 2 13B](https://huggingface.co/TheBloke/Pygmalion-2-13B-GGUF).)*

# Pygmalion 1.3B GGML

### This repository contains quantized conversions of the Pygmalion 1.3B checkpoint.

*For use with frontends that support GGML-quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).*

*Last updated on 2023-09-23.*
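As a rough sketch of what loading looks like outside a full frontend (assuming the `ctransformers` Python package is installed and one of the `.bin` files from this repo has been downloaded next to the script — the filename and `model_type` string here are illustrative and may vary with the loader version):

```python
from pathlib import Path

# Any quantization level from the table works; q4_0 is the smallest.
MODEL_FILE = "pygmalion-1.3b.q4_0.bin"

if Path(MODEL_FILE).exists():
    # Guarded import so the sketch is a no-op without the file/package.
    from ctransformers import AutoModelForCausalLM
    llm = AutoModelForCausalLM.from_pretrained(MODEL_FILE, model_type="gptneox")
    print(llm("Hello", max_new_tokens=16))
```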

Model | Startup RAM usage (KoboldCpp) | Startup RAM usage (Oobabooga)
:--:|:--:|:--:
pygmalion-1.3b.q4_0.bin | 1.0 GiB | 1.3 GiB
pygmalion-1.3b.q4_1.bin | 1.1 GiB | 1.4 GiB
pygmalion-1.3b.q5_0.bin | 1.2 GiB | 1.5 GiB
pygmalion-1.3b.q5_1.bin | 1.3 GiB | 1.6 GiB
pygmalion-1.3b.q8_0.bin | 1.7 GiB | 2.0 GiB
pygmalion-1.3b.f16.bin | 2.9 GiB | 3.2 GiB
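If you're unsure which file fits your machine, a small illustrative helper (not part of the repo) built from the KoboldCpp column of the table above:

```python
# Measured KoboldCpp startup RAM usage per file, in GiB (from the table above).
RAM_GIB = {
    "pygmalion-1.3b.q4_0.bin": 1.0,
    "pygmalion-1.3b.q4_1.bin": 1.1,
    "pygmalion-1.3b.q5_0.bin": 1.2,
    "pygmalion-1.3b.q5_1.bin": 1.3,
    "pygmalion-1.3b.q8_0.bin": 1.7,
    "pygmalion-1.3b.f16.bin": 2.9,
}

def pick_model(budget_gib):
    """Return the least-quantized file that still fits the budget, or None."""
    fitting = [(ram, name) for name, ram in RAM_GIB.items() if ram <= budget_gib]
    return max(fitting)[1] if fitting else None
```

For example, with 1.25 GiB to spare this picks `pygmalion-1.3b.q5_0.bin`.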
**Recommended settings:**

Pygmalion 1.3B is a limited model, left in the dust by the Pygmalion project's advancements since then. That's a shame, because it remains one of the few conversational models available for systems with less than 2 GiB of RAM, at least until we get [TinyLLaMA](https://github.com/jzhang38/TinyLlama) and quantized [Phi-1.5](https://huggingface.co/microsoft/phi-1_5).

Here are some tips to get the best results you can out of this model:
- Stick to a low temperature, preferably between 0.2 and 0.7.
- Keep your repetition penalty between 1.0 and 1.02. These tiny values are required for models based on Pythia Deduped; anything higher quickly degrades the output.
- If using SillyTavern, follow these settings:
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6251b9851842c08ef3111c4f/Yqvgv428hA9V67jC9VZTp.png)
- Keep character descriptions to a few sentences, roughly in line with CharacterAI's 500-character description limit.
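The tips above can be sketched as a generation request. This is a hypothetical example: the endpoint and field names (`rep_pen`, `temperature`, `max_length`) assume KoboldCpp's HTTP API and may differ in other frontends — check your frontend's documentation.

```python
import json

# Sampler values chosen from the recommended ranges above.
payload = {
    "prompt": "You: Hello!\n",
    "max_length": 80,
    "temperature": 0.5,  # keep between 0.2 and 0.7
    "rep_pen": 1.01,     # keep between 1.0 and 1.02
}
body = json.dumps(payload)
# e.g. requests.post("http://localhost:5001/api/v1/generate", data=body)
```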

**Notes:**
- KoboldCpp [[bfc696f]](https://github.com/LostRuins/koboldcpp/tree/bfc696fcc452975dbe8967c39301ba856d04a030) was tested without OpenBLAS.
- Oobabooga [[895ec9d]](https://github.com/oobabooga/text-generation-webui/tree/895ec9dadb96120e8202a83052bf9032ca3245ae) was tested with the `--model <model> --loader ctransformers --model_type gptneox` launch arguments.
- ggerganov/ggml [[8ca2c19]](https://github.com/ggerganov/ggml/tree/8ca2c19a3bb8622954d858fbf6383522684eaf34) was used for conversion and quantization.
- The original model is available at [PygmalionAI/pygmalion-1.3b](https://huggingface.co/PygmalionAI/pygmalion-1.3b).
- Earlier ggmlv2 quantizations are available [here](https://huggingface.co/Crataco/Pygmalion-1.3B-GGML/tree/15d3aa5e07372e4200c598443d211a2976db47f9).

Below is the original model card for Pygmalion 1.3B.