tianyuz committed
Commit 61a0c7f
1 Parent(s): 430118e

Upload folder using huggingface_hub

Files changed (4)
  1. .gitattributes +1 -0
  2. README.md +66 -0
  3. nekomata-7b-instruction.Q4_K_M.gguf +3 -0
  4. rinna.png +0 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ nekomata-7b-instruction.Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
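The added line has the form that `git lfs track "nekomata-7b-instruction.Q4_K_M.gguf"` writes to `.gitattributes`. As a minimal sketch of the same effect without Git LFS installed, the hypothetical helper `track_with_lfs` below (not part of Git, Git LFS, or this repository) appends the attribute line only if it is not already present. It assumes the pattern contains no spaces, which `git lfs track` would otherwise escape.

```shell
#!/usr/bin/env bash
# track_with_lfs PATTERN [ATTRIBUTES_FILE]
# Hypothetical helper: append an LFS attribute line for PATTERN, idempotently.
track_with_lfs() {
  local pattern="$1"
  local attrs_file="${2:-.gitattributes}"
  local line="$pattern filter=lfs diff=lfs merge=lfs -text"
  # -x matches the whole line, -F treats the line as a fixed string (no regex).
  if ! grep -qxF -- "$line" "$attrs_file" 2>/dev/null; then
    printf '%s\n' "$line" >> "$attrs_file"
  fi
}
```

Calling `track_with_lfs nekomata-7b-instruction.Q4_K_M.gguf` twice in a repository root leaves a single matching line in `.gitattributes`.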
README.md ADDED
@@ -0,0 +1,66 @@
+ ---
+ thumbnail: https://github.com/rinnakk/japanese-pretrained-models/blob/master/rinna.png
+ language:
+ - ja
+ - en
+ tags:
+ - qwen
+ inference: false
+ ---
+
+ # `rinna/nekomata-7b-instruction-gguf`
+
+ ![rinna-icon](./rinna.png)
+
+ # Overview
+ This model is the GGUF version of [`rinna/nekomata-7b-instruction`](https://huggingface.co/rinna/nekomata-7b-instruction). It can be used with [llama.cpp](https://github.com/ggerganov/llama.cpp) for lightweight inference.
+
+ Quantizing this model with GPTQ, AWQ, or GGUF q4_0 may cause stability issues. We recommend **GGUF q4_K_M** for 4-bit quantization.
+
+ See [`rinna/nekomata-7b-instruction`](https://huggingface.co/rinna/nekomata-7b-instruction) for details about the model architecture and data.
+
+ * **Authors**
+
+ - [Toshiaki Wakatsuki](https://huggingface.co/t-w)
+ - [Tianyu Zhao](https://huggingface.co/tianyuz)
+ - [Kei Sawada](https://huggingface.co/keisawada)
+
+ ---
+
+ # How to use the model
+
+ See [llama.cpp](https://github.com/ggerganov/llama.cpp) for more usage details.
+
+ ~~~~bash
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ make
+
+ MODEL_PATH=/path/to/nekomata-7b-instruction-gguf/nekomata-7b-instruction.Q4_K_M.gguf
+ MAX_N_TOKENS=512
+ # Instruction: "Translate the following Japanese into English."
+ PROMPT_INSTRUCTION="次の日本語を英語に翻訳してください。"
+ # Input: a Japanese definition of "large language model" to be translated.
+ PROMPT_INPUT="大規模言語モデル(だいきぼげんごモデル、英: large language model、LLM)は、多数のパラメータ(数千万から数十億)を持つ人工ニューラルネットワークで構成されるコンピュータ言語モデルで、膨大なラベルなしテキストを使用して自己教師あり学習または半教師あり学習によって訓練が行われる。"
+ # Template sections: 指示 = instruction, 入力 = input, 応答 = response.
+ PROMPT="以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。\n\n### 指示:\n${PROMPT_INSTRUCTION}\n\n### 入力:\n${PROMPT_INPUT}\n\n### 応答:\n"
+
+ ./main -m "${MODEL_PATH}" -n "${MAX_N_TOKENS}" -p "${PROMPT}"
+ ~~~~
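Inside double quotes, bash keeps `\n` as a literal backslash followed by `n`, so the prompt above reaches llama.cpp with unexpanded escape sequences. Whether they are then turned into newlines depends on the llama.cpp version and options (builds that offer `-e`/`--escape` process them when it is set). As a minimal sketch that does not rely on that behavior, the hypothetical helper `build_prompt` below (not part of llama.cpp or this repository) produces the same template with real newline characters. The input `こんにちは。` ("Hello.") is only a placeholder.

```shell
#!/usr/bin/env bash
# build_prompt INSTRUCTION INPUT
# Hypothetical helper: fill the instruction template with real newlines.
build_prompt() {
  local instruction="$1"
  local input="$2"
  # printf expands \n in the format string; %s inserts the arguments verbatim.
  printf '以下は、タスクを説明する指示と、文脈のある入力の組み合わせです。要求を適切に満たす応答を書きなさい。\n\n### 指示:\n%s\n\n### 入力:\n%s\n\n### 応答:\n' "$instruction" "$input"
}

# Command substitution strips trailing newlines, so append a sentinel
# character and remove it afterwards to keep the final newline.
PROMPT="$(build_prompt "次の日本語を英語に翻訳してください。" "こんにちは。"; printf x)"
PROMPT="${PROMPT%x}"

# Assumed invocation, mirroring the command above:
# ./main -m "${MODEL_PATH}" -n "${MAX_N_TOKENS}" -p "${PROMPT}"
```

Keeping the template in one `printf` format string leaves the instruction and input untouched, so backslashes or percent signs in user text are not reinterpreted.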
+
+ ---
+
+ # Tokenization
+ Please refer to [`rinna/nekomata-7b`](https://huggingface.co/rinna/nekomata-7b) for tokenization details.
+
+ ---
+
+ # How to cite
+ ~~~
+ @misc{RinnaNekomata7bInstructionGGUF,
+     url={https://huggingface.co/rinna/nekomata-7b-instruction-gguf},
+     title={rinna/nekomata-7b-instruction-gguf},
+     author={Wakatsuki, Toshiaki and Zhao, Tianyu and Sawada, Kei}
+ }
+ ~~~
+ ---
+
+ # License
+ [Tongyi Qianwen LICENSE AGREEMENT](https://github.com/QwenLM/Qwen/blob/main/Tongyi%20Qianwen%20LICENSE%20AGREEMENT)
nekomata-7b-instruction.Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d6dfc7e9d38bc0a4ae87fdf4480cf94ce1ebbc57af575b509cd84767ad3c5b6f
+ size 4899217600
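In a Git LFS pointer, `oid sha256:` is the SHA-256 digest of the real file's contents and `size` is its length in bytes, so a downloaded copy can be checked against both. A minimal sketch follows, using a hypothetical helper `verify_lfs_object` (not part of Git LFS) and assuming `sha256sum` or `shasum` is available.

```shell
#!/usr/bin/env bash
# verify_lfs_object FILE EXPECTED_SHA256 EXPECTED_SIZE
# Hypothetical helper: succeed only if FILE matches the pointer's oid and size.
verify_lfs_object() {
  local file="$1"
  local expected_sha="$2"
  local expected_size="$3"
  local actual_size actual_sha

  # Check the cheap property (size in bytes) before hashing the whole file.
  actual_size="$(wc -c < "$file" | tr -d '[:space:]')"
  if [ "$actual_size" != "$expected_size" ]; then
    echo "size mismatch: got $actual_size, expected $expected_size" >&2
    return 1
  fi

  # sha256sum ships with GNU coreutils; fall back to shasum where it is missing.
  if command -v sha256sum >/dev/null 2>&1; then
    actual_sha="$(sha256sum "$file" | cut -d' ' -f1)"
  else
    actual_sha="$(shasum -a 256 "$file" | cut -d' ' -f1)"
  fi
  if [ "$actual_sha" != "$expected_sha" ]; then
    echo "sha256 mismatch: got $actual_sha" >&2
    return 1
  fi
}

# Example call with the values from the pointer above:
# verify_lfs_object nekomata-7b-instruction.Q4_K_M.gguf \
#   d6dfc7e9d38bc0a4ae87fdf4480cf94ce1ebbc57af575b509cd84767ad3c5b6f 4899217600
```

The size check comes first because hashing a file of roughly 4.9 GB takes noticeably longer than reading its length.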
rinna.png ADDED