mabaochang committed
Commit 732daf4
Parent: 0ce6066

Update README.md

Files changed (1): README.md +31 -4
README.md CHANGED
@@ -1,5 +1,5 @@
 ---
-license: other
+license: gpl-3.0
 tags:
 - text2text-generation
 pipeline_tag: text2text-generation
@@ -53,12 +53,39 @@ c066b68b4139328e87a694020fc3a6c3 ./special_tokens_map.json.ca3d163bab0553818272
 39ec1b33fbf9a0934a8ae0f9a24c7163 ./tokenizer.model.9e556afd44213b6bd1be2b850ebbbd98f5481437a8021afaf58ee7fb1818d347.enc
 ```
 
-2. Decrypt the files using https://github.com/LianjiaTech/BELLE/tree/main/models#使用说明
+2. Decrypt the files using the scripts in https://github.com/LianjiaTech/BELLE/tree/main/models
+
+You can use the following Bash command.
+Replace "/path/to_encrypted" with the directory holding the encrypted files,
+"/path/to_original_llama_7B" with the directory holding the original LLaMA 7B checkpoint,
+and "/path/to_finetuned_model" with the directory where the recovered model should be saved.
+
+```bash
+mkdir -p /path/to_finetuned_model
+for f in "/path/to_encrypted"/*; do
+    if [ -f "$f" ]; then
+        python3 decrypt.py "$f" "/path/to_original_llama_7B/consolidated.00.pth" "/path/to_finetuned_model/"
+    fi
+done
+```
+
+After running this command, you should have the following files.
+
 ```
-for f in "encrypted"/*; do if [ -f "$f" ]; then python3 decrypt.py "$f" "original/7B/consolidated.00.pth" "result/"; fi; done
+./config.json
+./generation_config.json
+./pytorch_model-00001-of-00002.bin
+./pytorch_model-00002-of-00002.bin
+./pytorch_model.bin.index.json
+./special_tokens_map.json
+./tokenizer_config.json
+./tokenizer.model
 ```
 
 3. Check md5sum
+
+You can verify that the files were recovered intact by checking their MD5 checksums.
+Here are the MD5 checksums for the relevant files:
 ```
 md5sum ./*
 a57bf2d0d7ec2590740bc4175262610b ./config.json
@@ -87,7 +114,7 @@ After you decrypt the files, BELLE-LLAMA-7B-2M can be easily loaded with LlamaForCausalLM.
 from transformers import LlamaForCausalLM, AutoTokenizer
 import torch
 
-ckpt = './result/'
+ckpt = '/path/to_finetuned_model/'
 device = torch.device('cuda')
 model = LlamaForCausalLM.from_pretrained(ckpt, device_map='auto', low_cpu_mem_usage=True)
 tokenizer = AutoTokenizer.from_pretrained(ckpt)
 
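The Bash loop in step 2 can also be driven from Python on systems without a shell. This is a minimal sketch, assuming decrypt.py from the BELLE repository is in the working directory and takes the same three positional arguments as the command above (encrypted file, original LLaMA checkpoint, output directory); the /path/to_* placeholders are the same as in the README.

```python
# Python equivalent of the Bash decryption loop in step 2.
# Assumes decrypt.py (from https://github.com/LianjiaTech/BELLE/tree/main/models)
# is in the working directory and takes the positional arguments
#   <encrypted_file> <original_llama_checkpoint> <output_dir>
# exactly as in the README's command.
import subprocess
from pathlib import Path

encrypted_dir = Path("/path/to_encrypted")
llama_checkpoint = "/path/to_original_llama_7B/consolidated.00.pth"
output_dir = Path("/path/to_finetuned_model")
output_dir.mkdir(parents=True, exist_ok=True)

for f in sorted(encrypted_dir.iterdir()):
    if f.is_file():
        subprocess.run(
            ["python3", "decrypt.py", str(f), llama_checkpoint, f"{output_dir}/"],
            check=True,  # abort if any file fails to decrypt
        )
```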
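The md5sum check in step 3 can likewise be scripted. The sketch below recomputes each file's MD5 and compares it with the expected digest; only the config.json digest is visible in this diff, so the remaining entries are placeholders to be filled in from the full checksum list in the README.

```python
# Recompute MD5 digests of the decrypted files and compare them against the
# values from the README's "Check md5sum" step.
import hashlib
from pathlib import Path

# Only config.json's digest appears in this diff; copy the rest from the README.
EXPECTED = {
    "config.json": "a57bf2d0d7ec2590740bc4175262610b",
}

def md5_of(path: Path) -> str:
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
            digest.update(chunk)
    return digest.hexdigest()

root = Path("/path/to_finetuned_model")
for name, expected in EXPECTED.items():
    actual = md5_of(root / name)
    print(f'{"OK " if actual == expected else "BAD"} {name} {actual}')
```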
 
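Once the checksums match, the loading snippet from the last hunk can be extended into a full generation example. This is a sketch only: the Human/Assistant prompt template is an assumption about BELLE's instruction format, so verify the exact template against the model card.

```python
from transformers import LlamaForCausalLM, AutoTokenizer
import torch

ckpt = '/path/to_finetuned_model/'
device = torch.device('cuda')
model = LlamaForCausalLM.from_pretrained(ckpt, device_map='auto', low_cpu_mem_usage=True)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

# Hypothetical prompt template; check the model card for the exact format
# this checkpoint was fine-tuned with.
prompt = "Human: Write a short poem about the sea.\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors='pt').to(device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```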