pkupie/Llama-2-7b-FLAN-step884


KobayashiKanna01 committed on Dec 16, 2024

Commit 6f63885 · verified · 1 parent: 5f32f72

Update README.md

Files changed (1)

README.md +45 -3

@@ -1,3 +1,45 @@
----
-license: llama2
----

+---
+license: llama2
+language:
+- en
+- bo
+base_model:
+- meta-llama/Llama-2-7b-hf
+---
+A supervised fine-tuned model based on Llama-2-7b-hf.
+We use the FLAN datasets for training.
+#### Hyper-parameters:
+ * lr: 3e-5
+ * batch size: 0.25M (2K*128)
+ * lr scheduler: cosine
+ * min lr: 1e-5
+ * lr decay iters: 2048
+## Citation
+If you find this model useful in your work, please cite it with:
+```
+@inproceedings{tao-etal-2024-unlocking,
+    title = "Unlocking the Potential of Model Merging for Low-Resource Languages",
+    author = "Tao, Mingxu  and
+      Zhang, Chen  and
+      Huang, Quzhe  and
+      Ma, Tianyao  and
+      Huang, Songfang  and
+      Zhao, Dongyan  and
+      Feng, Yansong",
+    editor = "Al-Onaizan, Yaser  and
+      Bansal, Mohit  and
+      Chen, Yun-Nung",
+    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
+    month = nov,
+    year = "2024",
+    address = "Miami, Florida, USA",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.findings-emnlp.508",
+    doi = "10.18653/v1/2024.findings-emnlp.508",
+    pages = "8705--8720"
+}
+```
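
The hyper-parameters added to the README can be read as a cosine learning-rate schedule that decays from `lr` to `min lr` over `lr decay iters` steps. The following is a minimal sketch of that reading, not the authors' training code. It assumes no warmup phase (the README does not mention one), that "2K*128" means 2048 × 128, and that "step884" in the repository name refers to a training step; the function name `cosine_lr` is hypothetical.

```python
import math

PEAK_LR = 3e-5      # "lr" in the README
MIN_LR = 1e-5       # "min lr" in the README
DECAY_ITERS = 2048  # "lr decay iters" in the README

# Assumption: "2K*128" means 2048 * 128; the README does not state the unit.
BATCH_SIZE = 2048 * 128


def cosine_lr(step: int) -> float:
    """Cosine decay from PEAK_LR to MIN_LR over DECAY_ITERS steps.

    Assumes no warmup; steps beyond DECAY_ITERS stay at MIN_LR.
    """
    progress = min(max(step, 0), DECAY_ITERS) / DECAY_ITERS
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine


if __name__ == "__main__":
    # 262144 / 2**20 is exactly 0.25, matching the README's "0.25M".
    print(f"batch size: {BATCH_SIZE} ({BATCH_SIZE / 2**20}M)")
    # Step 884 is taken from the repository name (an assumption).
    for step in (0, 884, 1024, 2048):
        print(f"step {step}: lr = {cosine_lr(step):.3e}")
```

Under these assumptions the schedule starts at 3e-5, passes through 2e-5 at the midpoint (step 1024), and reaches 1e-5 at step 2048. If the actual run used warmup or a different decay shape, the values at intermediate steps would differ.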