---
license: apache-2.0
datasets:
- Intel/orca_dpo_pairs
tags:
- mistral
- dpo
- una
- finetune
- chatml
- instruct
---
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6468ce47e134d050a58aa89c/-BlRd-74Hk4B153wl_ryD.png)

# Neural-una-cybertron-7b

Neural-una-cybertron-7b is an [fblgit/una-cybertron-7b-v2-bf16](https://huggingface.co/fblgit/una-cybertron-7b-v2-bf16) model further fine-tuned with Direct Preference Optimization (DPO) on the [Intel/orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs) dataset.

This model was created after studying the procedure used for the [mlabonne/NeuralHermes-2.5-Mistral-7B](https://hf.co/mlabonne/NeuralHermes-2.5-Mistral-7B) model. Special thanks to [@mlabonne](https://hf.co/mlabonne).
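
That procedure reshapes the raw preference data into the prompt/chosen/rejected format that DPO training expects. The snippet below is a rough sketch of such preprocessing, assuming the `datasets` library and the ChatML template shown later in this card; the exact preprocessing used for this model is not stated here.

```python
# Hypothetical preprocessing sketch (not from this card): convert
# Intel/orca_dpo_pairs into ChatML-formatted prompt/chosen/rejected triples.
from datasets import load_dataset

def to_chatml(example):
    prompt = (
        f"<|im_start|>system\n{example['system']}<|im_end|>\n"
        f"<|im_start|>user\n{example['question']}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    return {
        "prompt": prompt,
        "chosen": example["chosen"] + "<|im_end|>\n",
        "rejected": example["rejected"] + "<|im_end|>\n",
    }

dataset = load_dataset("Intel/orca_dpo_pairs", split="train")
dataset = dataset.map(to_chatml, remove_columns=dataset.column_names)
print(dataset[0]["prompt"])
```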

## Additional Information

This model was fine-tuned on an `Nvidia A100-SXM4-40GB` GPU.

The total training time was 1 hour and 10 minutes.

# Prompt Template(s)

### ChatML

```
<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{user}<|im_end|>
<|im_start|>assistant
{assistant}<|im_end|>
```
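
The template can be filled in and passed to the model with the `transformers` library. Below is a minimal usage sketch; the repository id `Weyaxi/Neural-una-cybertron-7b` and the generation settings are assumptions rather than values stated in this card.

```python
# Minimal inference sketch; the model id and generation settings are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Weyaxi/Neural-una-cybertron-7b"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Fill in the ChatML template shown above.
prompt = (
    "<|im_start|>system\n"
    "You are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\n"
    "Explain Direct Preference Optimization in one paragraph.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```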

## Training hyperparameters

**LoRA**:
* r=16
* lora_alpha=16
* lora_dropout=0.05
* bias="none"
* task_type="CAUSAL_LM"
* target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']

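These values correspond directly to a `peft` `LoraConfig`; a minimal sketch, assuming the `peft` library:

```python
# LoRA configuration mirroring the values listed above (peft library assumed).
from peft import LoraConfig

peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "gate_proj", "v_proj", "up_proj", "q_proj", "o_proj", "down_proj"],
)
```
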
**Training arguments**:
* per_device_train_batch_size=4
* gradient_accumulation_steps=4
* gradient_checkpointing=True
* learning_rate=5e-5
* lr_scheduler_type="cosine"
* max_steps=200
* optim="paged_adamw_32bit"
* warmup_steps=100

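These map onto a standard `transformers` `TrainingArguments` object; a minimal sketch (the `output_dir` is an assumption, not stated in this card):

```python
# Training arguments mirroring the values listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./neural-una-cybertron-7b-dpo",  # assumed, not stated in the card
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",
    warmup_steps=100,
)
```
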
**DPOTrainer**:
* beta=0.1
* max_prompt_length=1024
* max_length=1536
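
Putting these pieces together, the training call with `trl`'s `DPOTrainer` can be sketched as below. This assumes a TRL version from around this model's release (where `beta` and the length limits are passed directly to the trainer) and reuses `dataset`, `peft_config`, and `training_args` from the earlier sketches; it is not the exact script used for this model.

```python
# DPO training sketch; `dataset`, `peft_config`, and `training_args` are assumed
# to be defined as in the sketches above. The base model id is taken from this card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOTrainer

base_id = "fblgit/una-cybertron-7b-v2-bf16"
tokenizer = AutoTokenizer.from_pretrained(base_id)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_id)

trainer = DPOTrainer(
    model,
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```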