Commit 02fdd73 by tianyuz
1 Parent(s): fad6c1e

Upload folder using huggingface_hub

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +64 -0
  3. nekomata-7b.Q4_K_M.gguf +3 -0
  4. rinna.png +0 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ nekomata-7b.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
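The added entry marks the GGUF file for Git LFS handling. As a rough check that is not part of the commit, the attribute can be queried with `git check-attr` inside a clone of the repository. The helper name `lfs_filter_for` is hypothetical and introduced here only for illustration.

```shell
# Hypothetical helper (not part of the commit): print the "filter" attribute
# that .gitattributes assigns to a path. Must be run inside a Git working tree.
lfs_filter_for() {
  # git check-attr prints "<path>: filter: <value>"; keep only the value.
  git check-attr filter -- "$1" | sed 's/^.*: filter: //'
}
```

In a working tree containing this `.gitattributes`, `lfs_filter_for nekomata-7b.Q4_K_M.gguf` should print `lfs`, while a path with no matching rule prints `unspecified`.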
README.md ADDED
@@ -0,0 +1,64 @@
+ ---
+ thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
+ language:
+ - ja
+ - en
+ tags:
+ - qwen
+ inference: false
+ ---
+
+ # `rinna/nekomata-7b-gguf`
+
+ ![rinna-icon](./rinna.png)
+
+ # Overview
+ The model is the GGUF version of [`rinna/nekomata-7b`](https://huggingface.co/rinna/nekomata-7b). It can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) for lightweight inference.
+
+ Quantization of this model may cause stability issues with GPTQ, AWQ, and GGUF q4_0. We recommend **GGUF q4_K_M** for 4-bit quantization.
+
+ See [`rinna/nekomata-7b`](https://huggingface.co/rinna/nekomata-7b) for details about model architecture and data.
+
+ * **Authors**
+
+ - [Toshiaki Wakatsuki](https://huggingface.co/t-w)
+ - [Tianyu Zhao](https://huggingface.co/tianyuz)
+ - [Kei Sawada](https://huggingface.co/keisawada)
+
+ ---
+
+ # How to use the model
+
+ See [llama.cpp](https://github.com/ggerganov/llama.cpp) for more usage details.
+
+ ~~~~bash
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ make
+
+ MODEL_PATH=/path/to/nekomata-7b-gguf/nekomata-7b.Q4_K_M.gguf
+ MAX_N_TOKENS=128
+ PROMPT="西田幾多郎は、"
+
+ ./main -m ${MODEL_PATH} -n ${MAX_N_TOKENS} -p "${PROMPT}"
+ ~~~~
+
+ ---
+
+ # Tokenization
+ Please refer to [`rinna/nekomata-7b`](https://huggingface.co/rinna/nekomata-7b) for tokenization details.
+
+ ---
+
+ # How to cite
+ ~~~
+ @misc{RinnaNekomata7bGGUF,
+ url={https://huggingface.co/rinna/nekomata-7b-gguf},
+ title={rinna/nekomata-7b-gguf},
+ author={Wakatsuki, Toshiaki and Zhao, Tianyu and Sawada, Kei}
+ }
+ ~~~
+ ---
+
+ # License
+ [Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)
nekomata-7b.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c9fda77da34e1093449299c7d3cf56ac71feaf3c80f264b7acf4b5846590987b
+ size 4899217600
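The three lines above are a Git LFS pointer: the repository stores only the SHA-256 digest and byte count, while the model weights themselves live in LFS storage. A minimal sketch of how a downloaded file could be checked against such a pointer follows. The helper name `verify_lfs_pointer` is hypothetical, and the sketch assumes the pointer text (without the leading `+` diff markers) has been saved to a file and that GNU coreutils are available.

```shell
# Hypothetical helper (not part of the commit): check a downloaded file against
# a Git LFS pointer file. Assumes GNU coreutils (sha256sum, wc) are available.
# Usage: verify_lfs_pointer POINTER_FILE DATA_FILE
verify_lfs_pointer() {
  pointer=$1
  data=$2
  # A pointer stores "oid sha256:<hex digest>" and "size <bytes>" on separate lines.
  want_oid=$(sed -n 's/^oid sha256://p' "$pointer")
  want_size=$(sed -n 's/^size //p' "$pointer")
  have_oid=$(sha256sum "$data" | cut -d ' ' -f 1)
  have_size=$(wc -c < "$data" | tr -d ' ')
  # Succeed (exit status 0) only when both the digest and the byte count match.
  [ "$want_oid" = "$have_oid" ] && [ "$want_size" = "$have_size" ]
}
```

For example, `verify_lfs_pointer pointer.txt nekomata-7b.Q4_K_M.gguf && echo OK` would print `OK` only if the local file has the recorded digest and the recorded size of 4899217600 bytes (about 4.9 GB). The file name `pointer.txt` is a placeholder.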
rinna.png ADDED