taesunwhang committed on
Commit
d788394
1 Parent(s): 4c10cab

Update README.md

  1. README.md +38 -3
README.md CHANGED
@@ -1,3 +1,38 @@
- ---
- license: llama2
- ---
+ ---
+ base_model:
+ - lmsys/vicuna-7b-v1.5
+ - meta-math/MetaMath-Llemma-7B
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ license: llama2
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+ The merge uses SLERP (spherical linear interpolation) over [lmsys/vicuna-7b-v1.5](https://huggingface.co/lmsys/vicuna-7b-v1.5) and [meta-math/MetaMath-Llemma-7B](https://huggingface.co/meta-math/MetaMath-Llemma-7B).
+
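To make the merge method concrete, here is a minimal sketch of SLERP applied to two flattened weight vectors. It is an illustration only and not mergekit's implementation: the function name `slerp`, the interpolation factor `t`, and the `eps` threshold are assumptions introduced for this example, and the README does not state which `t` values were used for this merge.

```python
import math

def slerp(t, v0, v1, eps=1e-6):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1).

    Hypothetical helper for illustration; inputs are plain lists of floats.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp():
        # Plain linear interpolation, used as a fallback
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 == 0.0 or norm1 == 0.0:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for acos
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))

    # Nearly colinear vectors: the sine below would be close to zero
    if abs(dot) > 1.0 - eps:
        return lerp()

    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors: roughly [0.7071, 0.7071]
print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike a plain average, which would give `[0.5, 0.5]` here, the interpolated vector keeps unit length, which is the usual motivation for choosing SLERP over linear averaging when merging weights.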
+ 1. Vicuna
+
+ ## Model Details
+
+ Vicuna is a chat assistant trained by fine-tuning Llama 2 on user-shared conversations collected from ShareGPT.
+
+ - **Developed by:** [LMSYS](https://lmsys.org/)
+ - **Model type:** An auto-regressive language model based on the transformer architecture
+ - **License:** Llama 2 Community License Agreement
+ - **Finetuned from model:** [Llama 2](https://arxiv.org/abs/2307.09288)
+
+ ### Model Sources
+
+ - **Repository:** https://github.com/lm-sys/FastChat
+ - **Blog:** https://lmsys.org/blog/2023-03-30-vicuna/
+ - **Paper:** https://arxiv.org/abs/2306.05685
+ - **Demo:** https://chat.lmsys.org/
+
+ 2. MetaMath Llemma
+
+ ## Model Details
+
+ MetaMath-Llemma-7B is fully fine-tuned on the MetaMathQA datasets, starting from the Llemma-7B model. Training on MetaMathQA and switching the base model from Llama-2-7B to Llemma-7B raises MATH performance from 19.8 to **30.0**.
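
To put the two MATH scores quoted above in perspective, the short sketch below computes the absolute and relative gain. The helper name `gain` is hypothetical; the only inputs are the two figures from the README (19.8 and 30.0).

```python
def gain(baseline, improved):
    """Return (absolute gain in points, relative gain as a fraction)."""
    absolute = improved - baseline
    relative = absolute / baseline
    return absolute, relative

absolute_gain, relative_gain = gain(19.8, 30.0)
# prints "10.2 points, 51.5%"
print(f"{absolute_gain:.1f} points, {relative_gain:.1%}")
```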