qwp4w3hyb committed on
Commit 888495e
1 Parent(s): cf0ad10

Update README.md

Files changed (1)
  1. README.md +9 -10
README.md CHANGED
@@ -18,19 +18,18 @@ tags:
 
 # Quant Infos
 
+## Includes latest bpe tokenizer fixes 🎉
+
+- Updated for latest bpe pre-tokenizer fixes https://github.com/ggerganov/llama.cpp/pull/6920
 - quants done with an importance matrix for improved quantization loss
-- quantized & generated imatrix from the f32 as f16 is inaccurate when converting from bf16
 - K & IQ quants in basically all variants from Q6_K down to IQ1_S
+- fixed end token for instruct mode (<|eot_id|>[128009])
+- Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [f4ab2a41476600a98067a9474ea8f9e6db41bcfa](https://github.com/ggerganov/llama.cpp/commit/f4ab2a41476600a98067a9474ea8f9e6db41bcfa) (master from 2024-04-29)
+- Imatrix generated with [this](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384) dataset.
+```
+./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat
+```
 
-Quantized with [llama.cpp](https://github.com/ggerganov/llama.cpp) commit [b4e4b8a9351d918a56831c73cf9f25c1837b80d1](https://github.com/ggerganov/llama.cpp/commit/b4e4b8a9351d918a56831c73cf9f25c1837b80d1) (master from 2024-04-24)
-
-Imatrix dataset was used from [here](https://github.com/ggerganov/llama.cpp/discussions/5263#discussioncomment-8395384)
-
-Using this command to generate the importance matrix from the f32.gguf
-
-```
-./imatrix -c 512 -m $model_name-f16.gguf -f $llama_cpp_path/groups_merged.txt -o $out_path/imat-f16-gmerged.dat
-```
 
 # Original Model Card
 
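
The README in this diff shows only the imatrix generation step, not how the resulting file is consumed. As a rough sketch, the generated `imat-f16-gmerged.dat` could be passed to llama.cpp's `quantize` tool through its `--imatrix` option, once per quant type. The `quant_cmds` helper, the output file naming, and the two quant types shown are assumptions for illustration, not the author's actual script; `$model_name` and `$out_path` are the same placeholders the README uses. The helper only prints the commands, so they can be inspected before anything is run.

```shell
#!/bin/sh
# Hypothetical helper: print one llama.cpp `quantize` command per quant type.
# Assumes the April 2024 binary name `./quantize` and its `--imatrix` option.
quant_cmds() {
    name=$1   # model name prefix (stands in for $model_name in the README)
    dir=$2    # output directory (stands in for $out_path in the README)
    shift 2
    for qtype in "$@"; do
        # The output file name pattern below is an assumption, not from the README.
        echo "./quantize --imatrix $dir/imat-f16-gmerged.dat $name-f16.gguf $dir/$name-$qtype.gguf $qtype"
    done
}

# Illustrative subset only; the README just says "Q6_K down to IQ1_S".
quant_cmds "${model_name:-model}" "${out_path:-out}" Q6_K IQ1_S
```

Piping the printed lines to `sh` would execute them; keeping the helper print-only makes it easy to review the full list of quant types first.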