bongchoi committed on
Commit
e629343
1 Parent(s): f23050c

Update README.md

Files changed (1)
  1. README.md +42 -0
README.md CHANGED
@@ -1,3 +1,45 @@
  ---
  license: mit
+ language:
+ - en
  ---
+ # **Introduction**
7
+ MoMo-70B-lora-1.8.7-DPO is trained via Direct Preference Optimization([DPO](https://arxiv.org/abs/2305.18290)) from [MoMo-70B-LoRA-V1.4](https://huggingface.co/moreh/MoMo-70B-LoRA-V1.4) as its base model, with several optimizations in hyperparameters.
8
+ [MoMo-70B-LoRA-V1.4](https://huggingface.co/moreh/MoMo-70B-LoRA-V1.4) is trained via Supervised Fine-Tuning (SFT) using [LoRA](https://arxiv.org/abs/2106.09685), with the QWEN-72B model as its base-model.
9
+ Note that we did not exploit any form of weight merge.
10
+ For leaderboard submission, the trained weight is realigned for compatibility with llama.
11
+ MoMo-70B is trained using **[Moreh](https://moreh.io/)**'s [MoAI platform](https://moreh.io/product), which simplifies the training of large-scale models, and AMD's MI250 GPU.
12
+
13
+
14
+ ## Details
15
+ ### Used Librarys
16
+ - torch
17
+ - peft
18
+ ### Used Datasets
19
+ - [slimorca](Open-Orca/SlimOrca)
20
+ - [truthy](https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1)
21
+ - [orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs)
22
+ - No other dataset was used
23
+ - No benchmark test set or the training set are used
24
+ - [data contamination check](https://github.com/swj0419/detect-pretrain-code-contamination) result
25
+
+ | Model                        | ARC | MMLU | TruthfulQA | GSM8K |
+ |------------------------------|-----|------|------------|-------|
+ | **V1.8.6 (result < 0.1, %)** | TBU | TBU  | TBU        | TBU   |
+ ### Used Environments
30
+ - AMD MI250 & MoAI platform
31
+ - Please visit https://moreh.io/product for more information about MoAI platform
32
+ - Or, contact us directly [contact@moreh.io](mailto:contact@moreh.io)
33
+
+ ## How to use
+
+ ```python
+ # pip install transformers==4.35.2
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load the tokenizer and the model from the same repository
+ tokenizer = AutoTokenizer.from_pretrained("moreh/MoMo-70B-lora-1.8.7-DPO")
+ model = AutoModelForCausalLM.from_pretrained(
+     "moreh/MoMo-70B-lora-1.8.7-DPO"
+ )
+ ```
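The README snippet stops once the model is loaded. A possible next step, generating text, might look like the sketch below. The helper name `generate_reply` and the `max_new_tokens` default are assumptions for illustration, and the README does not document a prompt format, so the prompt is passed through unchanged.

```python
def generate_reply(model, tokenizer, prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (hypothetical helper).

    Expects a tokenizer and a causal LM loaded with transformers, as in
    the README snippet. If the model sits on an accelerator, the inputs
    would also need to be moved to the model's device first.
    """
    # Tokenize the prompt into PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    # Generate a continuation of at most `max_new_tokens` new tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence back into text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With `model` and `tokenizer` loaded as in the README, calling `generate_reply(model, tokenizer, "Hello")` would return the decoded text, which for a causal LM includes the prompt followed by the continuation.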