dahara1 committed
Commit c1b63f5
1 Parent(s): 3105ad4

Update README.md

Files changed (1): README.md (+3 -3)
README.md CHANGED
@@ -10,15 +10,15 @@ original model [weblab-10b-instruction-sft](https://huggingface.co/matsuo-lab/we
 This model is a quantized (miniaturized) version of the original model (21.42 GB).
 
 There are currently two well-known quantization methods.
-(1) GPTQ (this model, 6.3 GB)
+(1) GPTQ model (this model, 6.3 GB)
 The size is smaller and execution is faster, but inference performance may be a little worse than the original model.
 At least one GPU is currently required due to a limitation of the Accelerate library.
 So this model cannot be run on the free tier of Hugging Face Spaces.
 You need the AutoGPTQ library to use this model.
 
-(2) gguf ([matsuolab-weblab-10b-instruction-sft-gguf](https://huggingface.co/mmnga/matsuolab-weblab-10b-instruction-sft-gguf), 6.03 GB), created by mmnga.
+(2) gguf model ([matsuolab-weblab-10b-instruction-sft-gguf](https://huggingface.co/mmnga/matsuolab-weblab-10b-instruction-sft-gguf), 6.03 GB), created by mmnga.
 You can use the gguf model with llama.cpp on a CPU-only machine.
-but it may be a little slower than GPTQ, especially for long text.
+But the gguf model may be a little slower than GPTQ, especially for long text.
 
 
 ### sample code
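
The README's sample code section follows this hunk; as a companion, here is a minimal sketch of loading the GPTQ variant with the AutoGPTQ library mentioned in the diff. The repo id `dahara1/weblab-10b-instruction-sft-GPTQ` and the plain-text prompt are assumptions for illustration, not the model card's actual sample; check the model card for the real id and instruction format.

```python
# Minimal sketch, not the model card's official sample code.
# Assumptions: the repo id below is hypothetical, and the prompt ignores
# whatever instruction template weblab-10b-instruction-sft actually uses.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_id = "dahara1/weblab-10b-instruction-sft-GPTQ"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
# At least one GPU is required, per the Accelerate limitation noted above.
model = AutoGPTQForCausalLM.from_quantized(
    model_id,
    device="cuda:0",
    use_safetensors=True,
)

prompt = "Explain what model quantization does."  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```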
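
For the gguf variant, the diff points at llama.cpp on a CPU-only machine. Here is a sketch using the llama-cpp-python bindings as a stand-in for the llama.cpp CLI; the `.gguf` file name below is hypothetical, so download the actual file from mmnga's repo linked above.

```python
# Minimal sketch using llama-cpp-python in place of the llama.cpp CLI.
# The model path is hypothetical; use the actual .gguf file from
# https://huggingface.co/mmnga/matsuolab-weblab-10b-instruction-sft-gguf
from llama_cpp import Llama

llm = Llama(model_path="./weblab-10b-instruction-sft.gguf")  # hypothetical file name

result = llm("Explain what model quantization does.", max_tokens=128)
print(result["choices"][0]["text"])
```

As the diff notes, this CPU path may run a little slower than the GPTQ model, especially on long text.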