yangyx30678 committed on
Commit
bff58f2
1 Parent(s): 7af464c

update: PR dpo

README.md CHANGED
@@ -1,70 +1,161 @@
  ---
- base_model: unsloth/llama-2-7b-bnb-4bit
  library_name: peft
  license: apache-2.0
- datasets: hermeschen1116/daily_dialog_for_RG
  tags:
  - trl
  - unsloth
- model-index:
- - name: response_generator_for_emotion_chat_bot
-   results: []
  language:
  - en
  pipeline_tag: text-generation
- metrics:
- - accuracy
- - f1-score
- ---
-
- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

  # Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot)

- This model is a fine-tuned version of [unsloth/llama-2-7b-bnb-4bit](https://huggingface.co/unsloth/llama-2-7b-bnb-4bit) on [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG), self modified version of [daily_dialog](li2017dailydialog/daily_dialog).

  ## Model description

- More information needed

  ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed
  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
- - system_prompt: ""
- - learning_rate: 0.0002
- - weight_decay: 0.001
- - max_grad_norm: 0.3
- - warmup_ratio: 0.03
- - max_steps: -1
- - train_batch_size: 4
- - seed: 42
- - optimizer: paged_adamw_32bit with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: constant
- - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1
- - init_lora_weights: true
- - lora_rank: 16
- - lora_alpha: 16
- - lora_dropout: 0.1
- - use_rslora: true

  ### Framework versions

  - PEFT 0.11.1
- - Transformers 4.41.2
  - Pytorch 2.3.0+cu121
- - Datasets 2.20.0
  - Tokenizers 0.19.1
  - Trl 0.8.6
- - Bitsandbytes

  ---
+ base_model:
+ - unsloth/llama-2-7b-bnb-4bit
+ - hermeschen1116/response_generator_for_emotion_chat_bot
  library_name: peft
  license: apache-2.0
+ datasets:
+ - Shotaro30678/rlhf-RG-trl-style-v3
  tags:
  - trl
  - unsloth
  language:
  - en
  pipeline_tag: text-generation

+ ---
  # Response Generator for [Emotion Chat Bot](https://github.com/hermeschen1116/chat-bot)

  ## Model description

+ This model is a DPO fine-tuned version of [hermeschen1116/response_generator_for_emotion_chat_bot](https://huggingface.co/hermeschen1116/response_generator_for_emotion_chat_bot), trained on [Shotaro30678/rlhf-RG-trl-style-v3](https://huggingface.co/datasets/Shotaro30678/rlhf-RG-trl-style-v3), a modified version of [daily_dialog](https://huggingface.co/datasets/li2017dailydialog/daily_dialog).

  ## Intended uses & limitations

+ The model was trained with TRL's DPO trainer as the RLHF step, so that its responses are more precise and consistent than those of the SFT baseline.
+
+ ## Model performance
+
+ **Sentiment score**, judged by [Shotaro30678/emotion_text_classifier_on_dd_v1](https://huggingface.co/Shotaro30678/emotion_text_classifier_on_dd_v1):
+
+ | **Metric** | **DPO Trained Model** | **SFT Model (Reference)** |
+ |--------------|:----------------------:|:--------------------------:|
+ | **Accuracy** | 0.851 | 0.788 |
+ | **F1-score** | 0.8564 | 0.7975 |
+
+ **Gibberish distribution**, judged by [madhurjindal/autonlp-Gibberish-Detector-492513457](https://huggingface.co/madhurjindal/autonlp-Gibberish-Detector-492513457):
+
+ | **Category** | **DPO Trained Model** | **SFT Model (Reference)** |
+ |---------------------|:----------------------:|:--------------------------:|
+ | **Clean** | 882 | 898 |
+ | **Mild Gibberish** | 94 | 58 |
+ | **Word Salad** | 21 | 33 |
+ | **Noise** | 3 | 11 |
+
+ **Cut-off outputs:**
+
+ | **Output Type** | **DPO Trained Model** | **SFT Model (Reference)** |
+ |---------------------|:----------------------:|:--------------------------:|
+ | **Complete Output** | 985 | 975 |
+ | **Incomplete Output** | 15 | 25 |
+
+ All results are computed on the [hermeschen1116/daily_dialog_for_RG](https://huggingface.co/datasets/hermeschen1116/daily_dialog_for_RG) test split; a sketch of the scoring procedure is shown below.
+
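+ A rough sketch of how such scores can be computed (an illustrative assumption, not the authors' exact evaluation code; `responses` stands for the list of generated replies and `reference_emotions` for the emotion labels the generator was asked to express):
+
+ ```python
+ from collections import Counter
+
+ from sklearn.metrics import accuracy_score, f1_score
+ from transformers import pipeline
+
+ # Classify the emotion actually expressed by each generated response
+ emotion_clf = pipeline("text-classification", model="Shotaro30678/emotion_text_classifier_on_dd_v1")
+ predicted = [r["label"] for r in emotion_clf(responses)]
+ print("Accuracy:", accuracy_score(reference_emotions, predicted))
+ # Weighted-average F1 assumed here
+ print("F1-score:", f1_score(reference_emotions, predicted, average="weighted"))
+
+ # Tally gibberish categories over the same responses
+ gibberish_clf = pipeline("text-classification", model="madhurjindal/autonlp-Gibberish-Detector-492513457")
+ print(Counter(r["label"] for r in gibberish_clf(responses)))
+ ```
+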
+ **Generation config used for testing:**
+ ```python
+ from transformers import GenerationConfig
+
+ # `tokenizer` is the model's tokenizer (see the quick sample below)
+ generation_config = GenerationConfig(
+     max_new_tokens=150,
+     min_new_tokens=5,
+     repetition_penalty=1.1,
+     top_k=3,
+     top_p=0.9,
+     pad_token_id=tokenizer.pad_token_id,
+     eos_token_id=tokenizer.eos_token_id,
+     temperature=1.0,
+     do_sample=True,
+     num_beams=1,
+ )
+ ```
  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
+ - beta=0.1
+ - remove_unused_columns=False
+ - num_train_epochs=3
+ - gradient_checkpointing=True
+
+ All other hyperparameters keep their default values; a sketch of how these settings fit into the trainer follows the list.

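+ A minimal sketch of how these settings plug into TRL 0.8.x's `DPOTrainer` (the variable names `model`, `tokenizer`, `dataset` and the output directory are illustrative assumptions, not taken from the actual training script):
+
+ ```python
+ from transformers import TrainingArguments
+ from trl import DPOTrainer
+
+ training_args = TrainingArguments(
+     output_dir="response_generator_DPO",  # hypothetical output path
+     num_train_epochs=3,
+     gradient_checkpointing=True,
+     remove_unused_columns=False,  # DPO needs the prompt/chosen/rejected columns
+ )
+
+ trainer = DPOTrainer(
+     model,                  # the SFT model, e.g. loaded with Unsloth
+     ref_model=None,         # with a PEFT model, TRL uses the frozen base as reference
+     beta=0.1,               # strength of the implicit KL penalty against the reference
+     args=training_args,
+     train_dataset=dataset,  # Shotaro30678/rlhf-RG-trl-style-v3 (TRL preference format)
+     tokenizer=tokenizer,
+ )
+ trainer.train()
+ ```
+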
  ### Framework versions

+ - Bitsandbytes 0.43.1
+ - Datasets 2.20.0
  - PEFT 0.11.1
  - Pytorch 2.3.0+cu121
+ - Transformers 4.42.4
  - Tokenizers 0.19.1
  - Trl 0.8.6
+ - unsloth 2024.7 0f2e484
+
+ # Uploaded model
+
+ - **Developed by:** Shotaro30678
+ - **Finetuned from model:** hermeschen1116/response_generator_for_emotion_chat_bot
+
+ This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
+
+ [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
+
+ # Quick sample
+ ```python
+ # ResponseGeneratorPipeline comes from the libs module in the project's GitHub repo
+ from libs import ResponseGeneratorPipeline
+ from transformers import GenerationConfig
+ from unsloth import FastLanguageModel
+
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="Shotaro30678/response_generator_DPO",
+     load_in_4bit=True,
+ )
+ FastLanguageModel.for_inference(model)  # enable native 2x faster inference
+
+ bot = ResponseGeneratorPipeline(
+     model,
+     tokenizer,
+     framework="pt",
+     task="conversation-generation",
+     num_workers=16,
+     torch_dtype="auto",
+     add_special_tokens=True,
+     truncation=False,
+     padding=True,
+ )
+
+ # Each turn pairs an utterance with its emotion label; the empty final
+ # assistant turn is the slot the generator fills in
+ conversation = [
+     {"content": {"dialog": "", "emotion": ""}, "role": "system"},
+     {"content": {"dialog": "Can you do push-ups ?", "emotion": "neutral"}, "role": "user"},
+     {"content": {"dialog": "Of course I can . It's a piece of cake ! Believe it or not , I can do 30 push-ups a minute .", "emotion": "neutral"}, "role": "assistant"},
+     {"content": {"dialog": "Really ? I think that's impossible !", "emotion": "surprise"}, "role": "user"},
+     {"content": {"dialog": "You mean 30 push-ups ?", "emotion": "neutral"}, "role": "assistant"},
+     {"content": {"dialog": "Yeah !", "emotion": "neutral"}, "role": "user"},
+     {"content": {"dialog": "", "emotion": "neutral"}, "role": "assistant"},
+ ]
+
+ generation_config = GenerationConfig(
+     max_new_tokens=150,
+     min_new_tokens=5,
+     repetition_penalty=1.1,
+     top_k=3,
+     top_p=0.9,
+     pad_token_id=tokenizer.pad_token_id,
+     eos_token_id=tokenizer.eos_token_id,
+     temperature=1.0,
+     do_sample=True,
+     num_beams=1,
+ )
+
+ print(bot(conversation, generation_config=generation_config)[0]["generated_text"][-1]["content"]["dialog"])
+ ```
+ **Output:**
+ ```
+ 30 push-ups in a row?
+ ```
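+
+ Since the 4-bit quantization settings ship in `config.json`, the checkpoint can presumably also be loaded without Unsloth via plain Transformers. An untested sketch; the prompt string hand-builds the `[INST]`/`[EMOTION]` format and is an assumption, not the pipeline's exact template:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "Shotaro30678/response_generator_DPO", device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("Shotaro30678/response_generator_DPO")
+
+ # Hypothetical prompt using the repo's special tokens
+ prompt = "[INST] [EMOTION]neutral[/EMOTION] Can you do push-ups ? [/INST]"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, top_k=3)
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```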
adapter_config.json DELETED
@@ -1,37 +0,0 @@
- {
-   "alpha_pattern": {},
-   "auto_mapping": null,
-   "base_model_name_or_path": "unsloth/llama-2-7b-bnb-4bit",
-   "bias": "none",
-   "fan_in_fan_out": false,
-   "inference_mode": true,
-   "init_lora_weights": true,
-   "layer_replication": null,
-   "layers_pattern": null,
-   "layers_to_transform": null,
-   "loftq_config": {},
-   "lora_alpha": 16,
-   "lora_dropout": 0.1,
-   "megatron_config": null,
-   "megatron_core": "megatron.core",
-   "modules_to_save": [
-     "lm_head",
-     "embed_tokens"
-   ],
-   "peft_type": "LORA",
-   "r": 8,
-   "rank_pattern": {},
-   "revision": "unsloth",
-   "target_modules": [
-     "up_proj",
-     "v_proj",
-     "o_proj",
-     "down_proj",
-     "k_proj",
-     "q_proj",
-     "gate_proj"
-   ],
-   "task_type": "CAUSAL_LM",
-   "use_dora": false,
-   "use_rslora": true
- }
config.json ADDED
@@ -0,0 +1,43 @@
+ {
+   "_name_or_path": "16bit_model_3epo-v3",
+   "architectures": [
+     "LlamaForCausalLM"
+   ],
+   "attention_bias": false,
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 11008,
+   "max_position_embeddings": 4096,
+   "mlp_bias": false,
+   "model_type": "llama",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 32,
+   "pad_token_id": 0,
+   "pretraining_tp": 1,
+   "quantization_config": {
+     "bnb_4bit_compute_dtype": "bfloat16",
+     "bnb_4bit_quant_type": "nf4",
+     "bnb_4bit_use_double_quant": true,
+     "llm_int8_enable_fp32_cpu_offload": false,
+     "llm_int8_has_fp16_weight": false,
+     "llm_int8_skip_modules": null,
+     "llm_int8_threshold": 6.0,
+     "load_in_4bit": true,
+     "load_in_8bit": false,
+     "quant_method": "bitsandbytes"
+   },
+   "rms_norm_eps": 1e-05,
+   "rope_scaling": null,
+   "rope_theta": 10000.0,
+   "tie_word_embeddings": false,
+   "torch_dtype": "bfloat16",
+   "transformers_version": "4.42.4",
+   "unsloth_version": "2024.7",
+   "use_cache": false,
+   "vocab_size": 32005
+ }
generation_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "bos_token_id": 1,
+   "do_sample": true,
+   "eos_token_id": 2,
+   "max_length": 4096,
+   "pad_token_id": 0,
+   "temperature": 0.6,
+   "top_p": 0.9,
+   "transformers_version": "4.42.4"
+ }
adapter_model.safetensors → model.safetensors RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e33740cf840bcfdb116fbeca8b9356c6534619a884b57c85950643e7865e95d9
- size 1653123632
+ oid sha256:3dffd8b9dc73074d84ce714f560c842b88f3622ebca65c31534bf38c1304cd86
+ size 3866124296
special_tokens_map.json CHANGED
@@ -1,33 +1,9 @@
  {
    "additional_special_tokens": [
-     {
-       "content": "[INST]",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false
-     },
-     {
-       "content": "[/INST]",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false
-     },
-     {
-       "content": "[EMOTION]",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false
-     },
-     {
-       "content": "[/EMOTION]",
-       "lstrip": false,
-       "normalized": false,
-       "rstrip": false,
-       "single_word": false
-     }
+     "[INST]",
+     "[/INST]",
+     "[EMOTION]",
+     "[/EMOTION]"
    ],
    "bos_token": {
      "content": "<s>",
tokenizer_config.json CHANGED
@@ -1,6 +1,7 @@
  {
    "add_bos_token": true,
    "add_eos_token": false,
+   "add_prefix_space": null,
    "added_tokens_decoder": {
      "0": {
        "content": "<unk>",
@@ -80,7 +81,7 @@
    "legacy": false,
    "model_max_length": 4096,
    "pad_token": "<pad>",
-   "padding_side": "right",
+   "padding_side": "left",
    "sp_model_kwargs": {},
    "tokenizer_class": "LlamaTokenizer",
    "unk_token": "<unk>",
training_args.bin DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:1c7df05d620f636d89b16a79736c9367ba18b1a8b0a255003afdef713dccd26d
- size 5176