OllieStanley committed on
Commit d82939e
1 Parent(s): 6ffcd7e

Update README.md

Files changed (1)
  1. README.md +14 -14
README.md CHANGED
@@ -10,7 +10,7 @@ Thanks to Mick for writing the `xor_codec.py` script which enables this process
 
 ## The Process
 
-Note: This process applies to `oasst-sft-6-llama-30b` model. The same process can be applied to other models in future, but the checksums will be different..
+Note: This process applies to `oasst-rlhf-2-llama-30b-7k-steps` model. The same process can be applied to other models in future, but the checksums will be different..
 
 To use OpenAssistant LLaMa-Based Models, you need to have a copy of the original LLaMa model weights and add them to a `llama` subdirectory here.
 
@@ -54,7 +54,7 @@ edd1a5897748864768b1fab645b31491 ./tokenizer_config.json
 Once you have LLaMa weights in the correct format, you can apply the XOR decoding:
 
 ```
-python xor_codec.py oasst-sft-6-llama-30b/ oasst-sft-6-llama-30b-xor/ llama30b_hf/
+python xor_codec.py oasst-rlhf-2-llama-30b-7k-steps/ oasst-rlhf-2-llama-30b-7k-steps-xor/ llama30b_hf/
 ```
 
 You should expect to see one warning message during execution:
@@ -63,24 +63,24 @@ You should expect to see one warning message during execution:
 
 This is normal. If similar messages appear for other files, something has gone wrong.
 
-Now run `find -type f -exec md5sum "{}" + > checklist.chk` in the output directory (here `oasst-sft-6-llama-30b`). You should get a file with exactly these contents:
+Now run `find -type f -exec md5sum "{}" + > checklist.chk` in the output directory (here `oasst-rlhf-2-llama-30b-7k-steps`). You should get a file with exactly these contents:
 
 ```
-970e99665d66ba3fad6fdf9b4910acc5 ./pytorch_model-00007-of-00007.bin
-659fcb7598dcd22e7d008189ecb2bb42 ./pytorch_model-00003-of-00007.bin
-ff6e4cf43ddf02fb5d3960f850af1220 ./pytorch_model-00001-of-00007.bin
+d08594778f00abe70b93899628e41246 ./pytorch_model-00007-of-00007.bin
+f11acc069334434d68c45a80ee899fe5 ./pytorch_model-00003-of-00007.bin
+9f41bd4d5720d28567b3e7820b4a8023 ./pytorch_model-00001-of-00007.bin
 27b0dc092f99aa2efaf467b2d8026c3f ./added_tokens.json
-aee09e21813368c49baaece120125ae3 ./generation_config.json
-740c324ae65b1ec25976643cda79e479 ./pytorch_model-00005-of-00007.bin
-f7aefb4c63be2ac512fd905b45295235 ./pytorch_model-00004-of-00007.bin
+148bfd184af630a7633b4de2f41bfc49 ./generation_config.json
+b6e90377103e9270cbe46b13aed288ec ./pytorch_model-00005-of-00007.bin
+4c5941b4ee12dc0d8e6b5ca3f6819f4d ./pytorch_model-00004-of-00007.bin
 eeec4125e9c7560836b4873b6f8e3025 ./tokenizer.model
-369df2f0e38bda0d9629a12a77c10dfc ./pytorch_model-00006-of-00007.bin
-27b9c7c8c62db80e92de14724f4950f3 ./config.json
+2c92d306969c427275f34b4ebf66f087 ./pytorch_model-00006-of-00007.bin
+9a4d2468ecf85bf07420b200faefb4af ./config.json
 deb33dd4ffc3d2baddcce275a00b7c1b ./tokenizer.json
-76d47e4f51a8df1d703c6f594981fcab ./pytorch_model.bin.index.json
+13a3641423840eb89f9a86507a90b2bf ./pytorch_model.bin.index.json
 ed59bfee4e87b9193fea5897d610ab24 ./tokenizer_config.json
-130f5e690becc2223f59384887c2a505 ./special_tokens_map.json
-ae48c4c68e4e171d502dd0896aa19a84 ./pytorch_model-00002-of-00007.bin
+704373f0c0d62be75e5f7d41d39a7e57 ./special_tokens_map.json
+ed991042b2a449123824f689bb94b29e ./pytorch_model-00002-of-00007.bin
 ```
 
 If so you have successfully decoded the weights and should be able to use the model with HuggingFace Transformers.
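
For the curious, the decoding step is conceptually a byte-wise XOR of each file in the `-xor` directory against the corresponding file derived from the original LLaMa weights. The sketch below illustrates that idea only; it is not the real `xor_codec.py` (which also decides how files are paired, chunked, and validated), and the file paths in the usage comment are hypothetical.

```
# Conceptual sketch of XOR decoding: NOT the real xor_codec.py, just the core idea.
from pathlib import Path

def xor_decode(xor_path: Path, base_path: Path, out_path: Path, chunk_size: int = 1 << 20) -> None:
    """XOR the bytes of xor_path with the bytes of base_path and write the result."""
    with open(xor_path, "rb") as fx, open(base_path, "rb") as fb, open(out_path, "wb") as fo:
        while True:
            a = fx.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            # XOR the overlapping bytes; pass through any trailing bytes from the longer stream.
            n = min(len(a), len(b))
            fo.write(bytes(x ^ y for x, y in zip(a[:n], b[:n])))
            fo.write(a[n:] or b[n:])

# Hypothetical example, following the directory layout used above:
# xor_decode(Path("oasst-rlhf-2-llama-30b-7k-steps-xor/pytorch_model-00001-of-00007.bin"),
#            Path("llama30b_hf/pytorch_model-00001-of-00007.bin"),
#            Path("oasst-rlhf-2-llama-30b-7k-steps/pytorch_model-00001-of-00007.bin"))
```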
 
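If `md5sum` is not available on your system, a rough Python equivalent of the verification step is sketched below. It assumes the decoded files are in the `oasst-rlhf-2-llama-30b-7k-steps` directory and prints lines in the same `<md5> ./<path>` shape as `checklist.chk`, so the output can be compared against the list in the diff above.

```
# Recompute MD5 checksums for every file in the decoded output directory.
# Run this before creating checklist.chk (or ignore that entry in the output).
import hashlib
from pathlib import Path

def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

out_dir = Path("oasst-rlhf-2-llama-30b-7k-steps")
for p in sorted(out_dir.rglob("*")):
    if p.is_file():
        print(f"{md5_of(p)} ./{p.relative_to(out_dir)}")
```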
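
Once the checksums match, the directory can be loaded like any local Hugging Face checkpoint. The snippet below is a minimal usage sketch, assuming `transformers`, `accelerate`, and `sentencepiece` are installed, that the decoded weights sit in `./oasst-rlhf-2-llama-30b-7k-steps`, and that the usual OpenAssistant `<|prompter|>`/`<|assistant|>` prompt format applies (check the model card to confirm).

```
# Minimal loading sketch for the decoded checkpoint (paths and prompt format are assumptions).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "oasst-rlhf-2-llama-30b-7k-steps"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
# device_map="auto" spreads the 30B model across available devices (requires accelerate).
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", torch_dtype="auto")

prompt = "<|prompter|>Hello, how are you?<|endoftext|><|assistant|>"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```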