Crataco committed on
Commit
716d6ff
Parent: 917311a

Update README.md

Files changed (1):
  README.md +33 -11
README.md CHANGED
@@ -2,7 +2,8 @@
 license: agpl-3.0
 language:
 - en
-thumbnail:
+model_creator: PygmalionAI
+quantized_by: Crataco
 tags:
 - ggml
 - text generation
@@ -10,19 +11,40 @@ tags:
 inference: false
 ---
 
-*(Not to be confused with [Pygmalion 13B](https://huggingface.co/TehVenom/Pygmalion-13b-GGML).)*
-
-This is converted and quantized from [Pygmalion 1.3B](https://huggingface.co/PygmalionAI/pygmalion-1.3b), based on [an earlier version](https://huggingface.co/EleutherAI/pythia-1.4b-deduped-v0) of Pythia 1.4B Deduped.
-
-Notes:
-- Converted with ggerganov/ggml's gpt-neox conversion script, and tested with KoboldCpp.
-- I can't promise that this will work with other frontends, if at all. I've had problems with the tokenizer, which could be related to the ggml implementation of GPT-NeoX [(source)](https://github.com/ggerganov/ggml/tree/master/examples/gpt-neox#notes).
-
-### RAM USAGE (on KoboldCpp w/ OpenBLAS)
-Model | Initial RAM
-:--:|:--:
-ggml-pygmalion-1.3b-q4_0.bin | 1.1 GiB
-ggml-pygmalion-1.3b-q5_1.bin | 1.3 GiB
+*(Not to be confused with [Pygmalion 13B](https://huggingface.co/TehVenom/Pygmalion-13b-GGML) or [Pygmalion 2 13B](https://huggingface.co/TheBloke/Pygmalion-2-13B-GGUF).)*
+
+# Pygmalion 1.3B GGML
+### This repository contains quantized conversions of the Pygmalion 1.3B checkpoint.
+*For use with frontends that support GGML-quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).*
+
+*Last updated on 2023-09-23.*
+
+Model | Startup RAM usage (KoboldCpp) | Startup RAM usage (Oobabooga)
+:--:|:--:|:--:
+pygmalion-1.3b.q4_0.bin | 1.0 GiB | 1.3 GiB
+pygmalion-1.3b.q4_1.bin | 1.1 GiB | 1.4 GiB
+pygmalion-1.3b.q5_0.bin | 1.2 GiB | 1.5 GiB
+pygmalion-1.3b.q5_1.bin | 1.3 GiB | 1.6 GiB
+pygmalion-1.3b.q8_0.bin | 1.7 GiB | 2.0 GiB
+pygmalion-1.3b.f16.bin | 2.9 GiB | 3.2 GiB
+
+**Recommended settings:**
+
+Pygmalion 1.3B is a limited model, left in the dust by the Pygmalion project's advancements since then. That's a shame, as it remains one of the few conversational models available for systems with less than 2 GB of RAM, at least until we get [TinyLlama](https://github.com/jzhang38/TinyLlama) and quantized [Phi-1.5](https://huggingface.co/microsoft/phi-1_5).
+
+Here are some tips for getting the best results out of this model (a usage sketch follows the diff):
+- Stick to a low temperature, preferably between 0.2 and 0.7.
+- Keep your repetition penalty between 1.0 and 1.02. These tiny values are required for models based on Pythia Deduped. Any higher and you'll get [this]().
+- If using SillyTavern, use these settings:
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6251b9851842c08ef3111c4f/Yqvgv428hA9V67jC9VZTp.png)
+- You'll also want to keep character descriptions to a few sentences, roughly in line with CharacterAI's 500-character description limit.
+
+**Notes:**
+- KoboldCpp [[bfc696f]](https://github.com/LostRuins/koboldcpp/tree/bfc696fcc452975dbe8967c39301ba856d04a030) was tested without OpenBLAS.
+- Oobabooga [[895ec9d]](https://github.com/oobabooga/text-generation-webui/tree/895ec9dadb96120e8202a83052bf9032ca3245ae) was tested with the `--model <model> --loader ctransformers --model_type gptneox` launch arguments.
+- ggerganov/ggml [[8ca2c19]](https://github.com/ggerganov/ggml/tree/8ca2c19a3bb8622954d858fbf6383522684eaf34) was used for conversion and quantization.
+- The original model is available at [PygmalionAI/pygmalion-1.3b](https://huggingface.co/PygmalionAI/pygmalion-1.3b).
+- Earlier ggmlv2 quantizations are available [here](https://huggingface.co/Crataco/Pygmalion-1.3B-GGML/tree/15d3aa5e07372e4200c598443d211a2976db47f9).
 
 Below is the original model card for Pygmalion 1.3B.
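As a usage reference for the CTransformers path mentioned in the diff's notes, here is a minimal Python sketch that loads one of these quantized files and generates with the recommended sampling settings. It is a sketch under assumptions, not part of the commit: the file name, prompt text, and persona layout are illustrative (the persona/dialogue format is described in the original Pygmalion model card), and it assumes `pip install ctransformers` and a locally downloaded `.bin` file.

```python
# Minimal sketch: load a GGML GPT-NeoX file with ctransformers and
# generate with the settings recommended above. File name is an
# example; any quantization from the table should work the same way.
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "pygmalion-1.3b.q5_1.bin",  # local path to a quantized file
    model_type="gptneox",       # same model type passed to Oobabooga
)

# Pygmalion-style prompt: a short persona, then dialogue. Keeping the
# persona to a few sentences follows the tip above; the exact layout
# here is an assumption based on the original model card.
prompt = (
    "Assistant's Persona: A friendly, concise assistant.\n"
    "<START>\n"
    "You: Hello! Who are you?\n"
    "Assistant:"
)

text = llm(
    prompt,
    max_new_tokens=80,
    temperature=0.5,          # within the recommended 0.2-0.7 range
    repetition_penalty=1.01,  # within the recommended 1.0-1.02 range
    stop=["You:"],            # stop before the model writes the user's turn
)
print(text)
```

The same `.bin` files load unchanged in KoboldCpp, so a snippet like this only matters for programmatic use or for sanity-checking a download.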