Commit 421fe36
1 Parent(s): b72a77e
Update README.md (#20)
- Update README.md (ca4c527a4f27508dc4b30b3ff7eceb9380b91e07)
- Update tokenizer_config.json (fae8f82c0fd979031c43a9a077223e4e504d8b0d)
Co-authored-by: Yaowei Zheng <hiyouga@users.noreply.huggingface.co>
- README.md +28 -14
- tokenizer_config.json +1 -1
README.md
CHANGED
@@ -63,7 +63,7 @@ Training details:
 - optimizer: paged_adamw_32bit

 <details>
-<summary>To reproduce the model</summary>

 To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):

@@ -141,13 +141,12 @@ print(tokenizer.decode(response, skip_special_tokens=True))

 # 3. Examples

-The following are some examples generated by Llama3-8B-Chinese-Chat-**v2**, including examples of

 For the examples generated by Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#3-examples).

-
-
-## 🌟 3.1 Roleplay:

 **User:**

@@ -367,7 +366,10 @@ Llama3-8B-Chinese-Chat, doth show us the map.

 ---

-

 **User:**

@@ -483,9 +485,10 @@ Action:
 ]
 ```

-

-

 **User:**

@@ -563,7 +566,10 @@ A / B = C with no remainder, i.e., A = B * C.

 ---

-

 **User:**

@@ -633,7 +639,10 @@ A / B = C with no remainder, i.e., A = B * C.

 ---

-

 **User:**

@@ -682,9 +691,11 @@ I want you to remember that same determination and focus when you're working on

 Now, let's talk about something else. How have you been? What have you been up to lately?

----

-

 **User:**

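@@ -743,9 +754,10 @@ Now, let's talk about something else. How have you been? What have you been up t
 In short, the development of artificial intelligence is a complex, multi-dimensional process that requires not only technological innovation but also social, ethical, and legal support. In the future, AI will play an increasingly important role in our lives, and we need to prepare actively and work together to build a society that can make full use of the benefits AI brings while addressing the
 problems it may cause.

-

-

 **User:**
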
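@@ -978,3 +990,5 @@ int main() {
 In the `main` function, we provide an example that searches the text `"ABC ABCDAB ABCDABCDABDE"` for the pattern `"ABCDABD"`. If a match is found, the output is `"Pattern found at index"` followed by the starting index of the match. If no match is found, the output is `"Pattern not found"`.

 Note that this code assumes the input text and pattern string contain only lowercase English letters and Chinese characters. Handling other character sets may require appropriate adjustments.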
@@ -63,7 +63,7 @@ Training details:
 - optimizer: paged_adamw_32bit

 <details>
+<summary><b>To reproduce the model [CLICK TO EXPAND]</b></summary>

 To reproduce Llama3-8B-Chinese-Chat-**v2** (to reproduce Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#1-introduction)):

@@ -141,13 +141,12 @@ print(tokenizer.decode(response, skip_special_tokens=True))

 # 3. Examples

+The following are some examples generated by Llama3-8B-Chinese-Chat-**v2**, including examples of role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, and coding, etc.

 For the examples generated by Llama3-8B-Chinese-Chat-**v1**, please refer to [this link](https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat/blob/v1/README.md#3-examples).

+<details>
+<summary><b>Role playing [CLICK TO EXPAND]</b></summary>

 **User:**

@@ -367,7 +366,10 @@ Llama3-8B-Chinese-Chat, doth show us the map.

 ---

+</details>
+
+<details>
+<summary><b>Function calling [CLICK TO EXPAND]</b></summary>

 **User:**

@@ -483,9 +485,10 @@ Action:
 ]
 ```

+</details>

+<details>
+<summary><b>Math [CLICK TO EXPAND]</b></summary>

 **User:**

@@ -563,7 +566,10 @@ A / B = C with no remainder, i.e., A = B * C.

 ---

+</details>
+
+<details>
+<summary><b>Ruozhiba (弱智吧) [CLICK TO EXPAND]</b></summary>

 **User:**

@@ -633,7 +639,10 @@ A / B = C with no remainder, i.e., A = B * C.

 ---

+</details>
+
+<details>
+<summary><b>Safety [CLICK TO EXPAND]</b></summary>

 **User:**

@@ -682,9 +691,11 @@ I want you to remember that same determination and focus when you're working on

 Now, let's talk about something else. How have you been? What have you been up to lately?


+</details>
+
+<details>
+<summary><b>Writing [CLICK TO EXPAND]</b></summary>

 **User:**

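@@ -743,9 +754,10 @@ Now, let's talk about something else. How have you been? What have you been up t
 In short, the development of artificial intelligence is a complex, multi-dimensional process that requires not only technological innovation but also social, ethical, and legal support. In the future, AI will play an increasingly important role in our lives, and we need to prepare actively and work together to build a society that can make full use of the benefits AI brings while addressing the
 problems it may cause.

+</details>

+<details>
+<summary><b>Coding [CLICK TO EXPAND]</b></summary>

 **User:**
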
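@@ -978,3 +990,5 @@ int main() {
 In the `main` function, we provide an example that searches the text `"ABC ABCDAB ABCDABCDABDE"` for the pattern `"ABCDABD"`. If a match is found, the output is `"Pattern found at index"` followed by the starting index of the match. If no match is found, the output is `"Pattern not found"`.

 Note that this code assumes the input text and pattern string contain only lowercase English letters and Chinese characters. Handling other character sets may require appropriate adjustments.
+
+</details>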
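The context lines of the last README hunk describe a `main` function that searches the text `"ABC ABCDAB ABCDABCDABDE"` for the pattern `"ABCDABD"`. The code itself lies outside this diff, so the following is only a hedged Python sketch of such a search. It assumes the Knuth-Morris-Pratt algorithm, which the visible lines do not confirm, and the names `build_failure` and `kmp_search` are hypothetical.

```python
def build_failure(pattern):
    """For each prefix of pattern, length of its longest proper prefix that is also a suffix."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to the next shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the index of the first match of pattern in text, or -1 if there is none."""
    if not pattern:
        return 0
    fail = build_failure(pattern)
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1


if __name__ == "__main__":
    index = kmp_search("ABC ABCDAB ABCDABCDABDE", "ABCDABD")
    if index >= 0:
        print("Pattern found at index", index)  # prints: Pattern found at index 15
    else:
        print("Pattern not found")
```

Python compares Unicode code points directly, so this sketch does not share the character-set restriction mentioned in the README text.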
tokenizer_config.json
CHANGED
@@ -2050,7 +2050,7 @@
 }
 },
 "bos_token": "<|begin_of_text|>",
-"chat_template": "{{ '<|begin_of_text|>' }}{% set system_message = 'You are Llama3-8B-Chinese-Chat-v2,
 "clean_up_tokenization_spaces": true,
 "eos_token": "<|eot_id|>",
 "pad_token": "<|eot_id|>",
@@ -2050,7 +2050,7 @@
 }
 },
 "bos_token": "<|begin_of_text|>",
+"chat_template": "{{ '<|begin_of_text|>' }}{% set system_message = 'You are Llama3-8B-Chinese-Chat-v2, finetuned from Llama3-8B-Instruct on Chinese-English dataset using the ORPO algorithm. You are a helpful assistant.' %}{% if messages[0]['role'] == 'system' %}{% set system_message = messages[0]['content'] %}{% set loop_messages = messages[1:] %}{% else %}{% set loop_messages = messages %}{% endif %}{% if system_message is defined %}{{ '<|start_header_id|>system<|end_header_id|>\n\n' + system_message | trim + '<|eot_id|>' }}{% endif %}{% for message in loop_messages %}{{ '<|start_header_id|>' + message['role'] + '<|end_header_id|>\n\n'+ message['content'] | trim + '<|eot_id|>' }}{% endfor %}{% if add_generation_prompt %}{{ '<|start_header_id|>assistant<|end_header_id|>\n\n' }}{% endif %}",
 "clean_up_tokenization_spaces": true,
 "eos_token": "<|eot_id|>",
 "pad_token": "<|eot_id|>",
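The new `chat_template` value is a single-line Jinja template, which makes its control flow hard to follow. As a reading aid, here is a plain-Python sketch of what the template appears to do. It is not how the tokenizer evaluates it (that goes through Jinja); the function name `render_chat` is hypothetical, and the guard against an empty message list is an added assumption that the template itself does not contain.

```python
# Default system prompt, copied from the template in the diff.
DEFAULT_SYSTEM = (
    "You are Llama3-8B-Chinese-Chat-v2, finetuned from Llama3-8B-Instruct "
    "on Chinese-English dataset using the ORPO algorithm. "
    "You are a helpful assistant."
)


def render_chat(messages, add_generation_prompt=False):
    """Mirror the template: BOS, one system block, then one block per message."""
    system_message = DEFAULT_SYSTEM
    # A leading system message overrides the default and is dropped from the loop.
    # The `messages and` guard is an assumption added for this sketch.
    if messages and messages[0]["role"] == "system":
        system_message = messages[0]["content"]
        loop_messages = messages[1:]
    else:
        loop_messages = messages

    parts = ["<|begin_of_text|>"]
    # In Jinja, `| trim` binds tighter than `+`, so only the content is stripped.
    parts.append(
        "<|start_header_id|>system<|end_header_id|>\n\n"
        + system_message.strip()
        + "<|eot_id|>"
    )
    for message in loop_messages:
        parts.append(
            "<|start_header_id|>" + message["role"] + "<|end_header_id|>\n\n"
            + message["content"].strip()
            + "<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn for the model to complete.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chat([{"role": "user", "content": "Hello"}], add_generation_prompt=True))
```

Because the template always sets a default `system_message`, the `system_message is defined` check is always true, so a system block is emitted even when the caller supplies none.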