ProphetOfBostrom committed
Commit 5e16f7a
1 parent: b87d0c4
Files changed (1): README.md (+20 −7)
license_name: yi-license
license_link: https://huggingface.co/01-ai/Yi-34B-200K/blob/main/LICENSE
---
nvm, looks like the tokenizer on the source file is broken anyway. probably the base model's too. it loves `</s>` for some reason, but Yi doesn't use that token?

no downloads for now, make it yourself; the imatrix works. i'm feeling very irritable. do people not test these things? i know git-lfs has never been subject to any QA, so?
---
the dataset file was made by concatenating most of the [default exllamav2 calibration data](https://github.com/turboderp/exllamav2/tree/master/conversion/standard_cal_data): a ~900 KB file of coherent text only, with some formatting and code but no endless broken HTML tags or nonsense. includes multilingual text, for those deep layers.

like this:
```
$ cd exllamav2/conversion/standard_cal_data
$ cat technical.utf8 multilingual.utf8 code.utf8 tiny.utf8 > techmulcodetiny.utf8
```
see: [exllamav2/conversion/standard_cal_data](https://github.com/turboderp/exllamav2/tree/master/conversion/standard_cal_data) and [techmulcodetiny.utf8](./techmulcodetiny.utf8). the resulting file is consumed by imatrix as ~560 "chunks".
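the chunk count roughly checks out. assuming imatrix's default 512-token chunks and ~3 bytes of UTF-8 per token (both figures are my assumptions, not measured here):

```shell
# ~900 KB of text / ~3 bytes per token / 512-token chunks (assumed defaults)
echo $(( 900 * 1024 / 3 / 512 ))
```

that lands around 600, the same ballpark as the ~560 reported.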

imatrix was run with default settings apart from the dataset (i think? i increased the batch count and reduced the batch size so i could cram more layers onto the GPU, but the end result should have been the same).
(someone tell me why I was wrong to run imatrix with -cb continuous batching. shame me.)
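for reference, the invocation would have looked something like this. a sketch, not the author's exact command: flags per llama.cpp's imatrix tool, and the `-ngl` value is an assumption (it just offloads layers to the GPU); the output name matches the imatrix file used by quantize later.

```shell
# sketch: point llama.cpp's imatrix tool at the q6_K model and the dataset
llama.cpp]$ ./imatrix -m Kyllene-57B-v1.0.q6_K.gguf -f techmulcodetiny.utf8 \
    -o Kyllene-57B-v1.0.q6_K.gguf.imatrix -ngl 20
```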

how-to, because i'm grouchy, but I did actually want people to have these. remember to replace IQ2_M (it appears only twice, near the end) with whatever quant type you fancy. Q2_K might be more compatible.
```
~]$ git clone https://github.com/ggerganov/llama.cpp
~]$ cd llama.cpp
# if you're like me and you break llamas for fun and don't understand cmake:
llama.cpp]$ git switch master && git pull; git restore Makefile
# otherwise:
llama.cpp]$ git pull; make -j
llama.cpp]$ ./quantize --allow-requantize --imatrix Kyllene-57B-v1.0.q6_K.gguf.imatrix INPUT_DIRECTORY/Kyllene-57B-v1.0.q6_K.gguf Kyllene-57B-v1.0.IQ2_M.gguf IQ2_M
```

if your computer has fewer than 8 cores, add your core count to the end of the quantize command (there's an invisible 8 there by default). and yes, you can just use ./ (the llama.cpp directory) as INPUT_DIRECTORY.

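once quantize finishes, a quick smoke test is worth it before shipping the file anywhere. a sketch using llama.cpp's main binary; the prompt and token count are arbitrary:

```shell
# generate a few tokens from the new quant to confirm it loads and isn't corrupt
llama.cpp]$ ./main -m Kyllene-57B-v1.0.IQ2_M.gguf -p "The quick brown" -n 32
```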
# Downloads (eat my ass huggingface yeah just leave the cryptic git lfs error message on the far side of a 3 hour upload over LTE thanks)
no downloads for now. i've uploaded 50 gigabytes so far and none of them made it past the great wall of git-lfs.
you have the imatrix and the q6_K: DIY. IQ2_M is probably right for a 24GB card; IQ3_XXS for better quality if you offload the KV cache.
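if the great wall of git-lfs stays up, one workaround is huggingface_hub's CLI, which uploads through the Hub's HTTP API instead of git. a sketch; `USER/REPO` is a placeholder, not this repo's actual id, and you need `pip install -U huggingface_hub` plus a login token first:

```shell
# upload a single file over HTTP rather than git-lfs (placeholder repo id)
$ huggingface-cli upload USER/REPO Kyllene-57B-v1.0.IQ2_M.gguf
```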