KobayashiKanna01 committed on
Commit 6f63885
1 Parent(s): 5f32f72

Update README.md

Files changed (1)
  1. README.md +45 -3
README.md CHANGED
@@ -1,3 +1,45 @@
- ---
- license: llama2
- ---
+ ---
+ license: llama2
+ language:
+ - en
+ - bo
+ base_model:
+ - meta-llama/Llama-2-7b-hf
+ ---
+
+ A supervised fine-tuned (SFT) model based on Llama-2-7b-hf.
+
+ We use the FLAN datasets for training.
+
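+ A minimal usage sketch with the `transformers` library; the repository ID below is a placeholder, not necessarily this checkpoint's actual ID:
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model_id = "KobayashiKanna01/llama-2-7b-sft-flan"  # placeholder; replace with this repo's actual ID
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ prompt = "Answer the following question. What is the capital of France?"
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
+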
+ #### Hyper-parameters:
+ * lr: 3e-5
+ * batch size: 0.25M tokens (2K sequence length × 128 sequences)
+ * lr scheduler: cosine
+ * min lr: 1e-5
+ * lr decay iters: 2048
+
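+ For reference, a small sketch of how the learning rate would evolve under these settings, assuming the standard cosine-decay formula (not taken from the actual training code):
+
+ ```python
+ import math
+
+ LR, MIN_LR, DECAY_ITERS = 3e-5, 1e-5, 2048  # values from the list above
+
+ def lr_at(step: int) -> float:
+     """Cosine decay from LR to MIN_LR over DECAY_ITERS steps (assumed schedule)."""
+     if step >= DECAY_ITERS:
+         return MIN_LR
+     cosine = 0.5 * (1 + math.cos(math.pi * step / DECAY_ITERS))
+     return MIN_LR + (LR - MIN_LR) * cosine
+
+ print(lr_at(0))     # 3e-05 (peak lr)
+ print(lr_at(1024))  # 2e-05 (midpoint)
+ print(lr_at(2048))  # 1e-05 (min lr)
+ ```
+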
+ ## Citation
+ If you find this model useful in your work, please cite:
+ ```
+ @inproceedings{tao-etal-2024-unlocking,
+     title = "Unlocking the Potential of Model Merging for Low-Resource Languages",
+     author = "Tao, Mingxu and
+       Zhang, Chen and
+       Huang, Quzhe and
+       Ma, Tianyao and
+       Huang, Songfang and
+       Zhao, Dongyan and
+       Feng, Yansong",
+     editor = "Al-Onaizan, Yaser and
+       Bansal, Mohit and
+       Chen, Yun-Nung",
+     booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
+     month = nov,
+     year = "2024",
+     address = "Miami, Florida, USA",
+     publisher = "Association for Computational Linguistics",
+     url = "https://aclanthology.org/2024.findings-emnlp.508",
+     doi = "10.18653/v1/2024.findings-emnlp.508",
+     pages = "8705--8720"
+ }
+ ```