shenzhi-wang and hiyouga committed
Commit
421fe36
1 Parent(s): b72a77e

Update README.md (#20)

- Update README.md (ca4c527a4f27508dc4b30b3ff7eceb9380b91e07)
- Update tokenizer_config.json (fae8f82c0fd979031c43a9a077223e4e504d8b0d)


Co-authored-by: Yaowei Zheng <hiyouga@users.noreply.huggingface.co>

Files changed (2)
  1. README.md +28 -14
  2. tokenizer_config.json +1 -1
README.md CHANGED
@@ -63,7 +63,7 @@ Training details:
  - optimizer: paged_adamw_32bit
 
  <details>
- <summary>To reproduce the model</summary>
+ <summary><b>To reproduce the model [CLICK TO EXPAND]</b></summary>
 
  To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):
 
@@ -141,13 +141,12 @@ print(tokenizer.decode(response, skip_special_tokens=True))
 
  # 3. Examples
 
- The following are some examples generated by Llama3-8B-Chinese-Chat-**v2**, including examples of roleplay, function calling, math, RuoZhiBa (弱智吧), safety, writing, and coding, etc.
+ The following are some examples generated by Llama3-8B-Chinese-Chat-**v2**, including examples of role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, and coding, etc.
 
  For the examples generated by Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#3-examples).
 
- ---
-
- ## 🌟 3.1 Roleplay:
+ <details>
+ <summary><b>Role playing [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -367,7 +366,10 @@ Llama3-8B-Chinese-Chat, doth show us the map.
 
  ---
 
- ## 🌟 3.2 Function Calling:
+ </details>
+
+ <details>
+ <summary><b>Function calling [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -483,9 +485,10 @@ Action:
  ]
  ```
 
- ---
+ </details>
 
- ## 🌟 3.3 Math
+ <details>
+ <summary><b>Math [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -563,7 +566,10 @@ A / B = C 且没有余数,即 A = B * C。
 
  ---
 
- ## 🌟 3.4 RuoZhiBa (弱智吧)
+ </details>
+
+ <details>
+ <summary><b>Ruozhiba (弱智吧) [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -633,7 +639,10 @@ A / B = C 且没有余数,即 A = B * C。
 
  ---
 
- ## 🌟 3.5 Safety
+ </details>
+
+ <details>
+ <summary><b>Safety [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -682,9 +691,11 @@ I want you to remember that same determination and focus when you're working on
 
  Now, let's talk about something else. How have you been? What have you been up to lately?
 
- ---
 
- ## 🌟 3.6 Writing:
+ </details>
+
+ <details>
+ <summary><b>Writing [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -743,9 +754,10 @@ Now, let's talk about something else. How have you been? What have you been up t
  总之,人工智能的发展是一个多维度的复杂过程,它不仅需要技术的创新,还需要社会、伦理和法律的支持。未来,人工智能将继续在我们的生活中扮演越来越重要的角色,我们需要积极准备,共同构建一个能够充分利用AI带来的好处,同时解决其可能带来的
  问题的社会。
 
- ---
+ </details>
 
- ## 🌟 3.7 Coding:
+ <details>
+ <summary><b>Coding [CLICK TO EXPAND]</b></summary>
 
  **User:**
 
@@ -978,3 +990,5 @@ int main() {
  在`main`函数中,我们提供了一个例子,搜索文本`"ABC ABCDAB ABCDABCDABDE"`中的模式`"ABCDABD"`。如果找到匹配,输出将是`"Pattern found at index"`,随后是匹配的起始索引。如果没有找到匹配,输出将是`"Pattern not found"`。
 
  请注意,这段代码假设输入文本和模式字符串只包含小写英文字母和中文字符。如果需要处理其他字符集,可能需要适当调整。
+
+ </details>
tokenizer_config.json CHANGED
@@ -2050,7 +2050,7 @@
  }
  },
  "bos_token": "<|begin_of_text|>",
- "chat_template": "{{ '<|begin_of_text|>' }}{% set system_message = 'You are Llama3-8B-Chinese-Chat-v2, which is finetuned on Llama3-8B-Instruct with Chinese-English mixed data by the ORPO alignment algorithm. You are a helpful assistant.' %}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] %}{% set loop_messages = messages[1:] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{ '<|start_header_id|>system<|end_header_id|>\n\n' + system_message | trim + '<|eot_id|>' }}{% endif %}{% for message in loop_messages %}{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",
+ "chat_template": "{{ '<|begin_of_text|>' }}{% set system_message = 'You are Llama3-8B-Chinese-Chat-v2, finetuned from Llama3-8B-Instruct on Chinese-English dataset using the ORPO algorithm. You are a helpful assistant.' %}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] %}{% set loop_messages = messages[1:] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{ '<|start_header_id|>system<|end_header_id|>\n\n' + system_message | trim + '<|eot_id|>' }}{% endif %}{% for message in loop_messages %}{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",
  "clean_up_tokenization_spaces": true,
  "eos_token": "<|eot_id|>",
  "pad_token": "<|eot_id|>",