Commit d5ce910 by cgus (1 parent: 4582f3e)

Update README.md

Files changed (1): README.md (+30, -0)
README.md CHANGED
@@ -2,9 +2,39 @@
 
  license: cc-by-nc-4.0
  datasets:
  - victunes/nart-100k-synthetic-buddy-mixed-names
+ base_model: victunes/TherapyBeagle-11B-v2
+ inference: false
  ---
  **GGUF:** https://huggingface.co/victunes/TherapyBeagle-11B-v2-GGUF

+ # TherapyBeagle-11B-v2-exl2
+ Original model: [TherapyBeagle-11B-v2](https://huggingface.co/victunes/TherapyBeagle-11B-v2)
+ Model creator: [victunes](https://huggingface.co/victunes)
+
+ ## Quants
+ [4bpw h6](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/main)
+ [4.25bpw h6](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/4.25bpw-h6)
+ [4.65bpw h6](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/4.65bpw-h6)
+ [5bpw h6](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/5bpw-h6)
+ [6bpw h6](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/6bpw-h6)
+ [8bpw h8](https://huggingface.co/cgus/TherapyBeagle-11B-v2-exl2/tree/8bpw-h8)
+
+ ## Quantization notes
+ Made with exllamav2 0.0.18 with the default dataset.
+ The original BF16 .bin files were converted to FP16 safetensors.
+ When I compared 4bpw quants made from BF16 and FP16, the FP16 one showed about 0.1% quality loss.
+ I picked the FP16 version because its files loaded quickly, while the version made from BF16 loaded about 100 s slower.
+ Quantization metadata was removed from config.json to fix loading the model with some older Text-Generation-WebUI versions.
+
+ ## How to run
+ This quantization method uses the GPU and requires the ExLlamaV2 loader, which is available in the following applications:
+
+ [Text Generation Webui](https://github.com/oobabooga/text-generation-webui)
+ [KoboldAI](https://github.com/henk717/KoboldAI)
+ [ExUI](https://github.com/turboderp/exui)
+ [lollms-webui](https://github.com/ParisNeo/lollms-webui)
+
+ # Original model card
  # TherapyBeagle 11B v2

  _Buddy is here for {{user}}._
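As a rough aid for picking one of the bpw options in the Quants list above (my addition, not from the card): an exl2 file is approximately `params × bpw / 8` bytes. This ignores embeddings kept at higher precision and format overhead, so treat the numbers as lower bounds for an 11B-parameter model.

```python
# Back-of-the-envelope exl2 file-size estimate: params * bits-per-weight / 8.
# PARAMS is the nominal 11B parameter count; real files are somewhat larger
# because some tensors stay at higher precision.
PARAMS = 11e9

def quant_gigabytes(bpw: float, params: float = PARAMS) -> float:
    """Approximate quantized file size in GB for a given bits-per-weight."""
    return params * bpw / 8 / 1e9

for bpw in (4.0, 4.25, 4.65, 5.0, 6.0, 8.0):
    print(f"{bpw:>4} bpw ~ {quant_gigabytes(bpw):.1f} GB")
```

For example, the 4.25bpw quant comes out to roughly 5.8 GB of weights before overhead, which is a useful sanity check against available VRAM.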
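A note on the BF16-to-FP16 conversion mentioned in the quantization notes (my illustration, not part of the card): FP16 stores more mantissa bits than BF16 (10 vs 7), so converting an in-range BF16 value to FP16 is exact; only magnitudes beyond FP16's range (about ±65504) can be lost, which is consistent with the small measured difference. A stdlib-only sketch, simulating BF16 by truncating float32:

```python
import struct

def to_bf16(x: float) -> float:
    # Simulate bfloat16 by truncating a float32 to its top 16 bits.
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", (bits >> 16) << 16))[0]

def to_fp16(x: float) -> float:
    # Round-trip through IEEE half precision (struct's "e" format).
    return struct.unpack("<e", struct.pack("<e", x))[0]

w = to_bf16(3.14159)       # a typical in-range weight value
exact = (to_fp16(w) == w)  # BF16 -> FP16 loses nothing here

try:
    # BF16 can hold magnitudes FP16 cannot: packing past ~65504 overflows.
    to_fp16(to_bf16(70000.0))
    overflowed = False
except OverflowError:
    overflowed = True

print(exact, overflowed)
```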