afaji committed on
Commit
ed647a1
1 Parent(s): 809182d

auto-rename
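
The commit message suggests these edits were produced by a script. The sketch below shows one way such a rename could be automated, with rules inferred only from the changed lines in this diff. The function name `auto_rename` and the `RENAME_RULES` list are hypothetical, and this is not the author's actual tooling. The `author` field filled in by the same commit is new information rather than a rename, so it is not covered.

```python
import re

# Hypothetical rename rules, inferred from the changed lines of this commit.
# Each rule is written so that running it twice leaves the text unchanged.
RENAME_RULES = [
    # GitHub repository path: mbzuai-nlp/lamini -> mbzuai-nlp/lamini-lm
    (re.compile(r"mbzuai-nlp/lamini(?!-lm)"), "mbzuai-nlp/lamini-lm"),
    # Logo file name, which changed in the same commit.
    (re.compile(r"images/LaMnin\.png"), "images/lamini.png"),
    # Dataset link text: "LaMini dataset" -> "LaMini-instruction dataset"
    (re.compile(r"\bLaMini dataset\b"), "LaMini-instruction dataset"),
    # Paper title and series name: "LaMini" -> "LaMini-LM".
    # The lookahead leaves names such as "LaMini-Flan-T5-783M" and
    # "LaMini-instruction" untouched.
    (re.compile(r"\bLaMini(?=:| model series| series)"), "LaMini-LM"),
]


def auto_rename(text: str) -> str:
    """Apply every rename rule to the given README text, in order."""
    for pattern, replacement in RENAME_RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    old_line = '<th colspan="4">LaMini series (#parameters)</th>'
    print(auto_rename(old_line))
```

Because the rules are idempotent, the function can be re-run safely over a README that has already been renamed.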

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -19,21 +19,21 @@ widget:
  should probably proofread and complete it, then remove this comment. -->
 
  <p align="center" width="100%">
- <a><img src="https://raw.githubusercontent.com/mbzuai-nlp/lamini/main/images/LaMnin.png" alt="Title" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
+ <a><img src="https://raw.githubusercontent.com/mbzuai-nlp/lamini-lm/main/images/lamini.png" alt="Title" style="width: 100%; min-width: 300px; display: block; margin: auto;"></a>
  </p>
 
  # LaMini-Flan-T5-783M
 
  [![Model License](https://img.shields.io/badge/Model%20License-CC%20By%20NC%204.0-red.svg)]()
 
- This model is one of our LaMini model series in paper "[LaMini: A Diverse Herd of Distilled Models from Large-Scale Instructions](https://github.com/mbzuai-nlp/lamini)". This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on [LaMini dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction) that contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository](https://github.com/mbzuai-nlp/lamini/).
- You can view other LaMini model series as follow. Note that not all models are performing as well. Models with ✩ are those with the best overall performance given their size/architecture. More details can be seen in our paper.
+ This model is one of our LaMini-LM model series in paper "[LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions](https://github.com/mbzuai-nlp/lamini-lm)". This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) on [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction) that contains 2.58M samples for instruction fine-tuning. For more information about our dataset, please refer to our [project repository](https://github.com/mbzuai-nlp/lamini-lm/).
+ You can view other LaMini-LM model series as follow. Note that not all models are performing as well. Models with ✩ are those with the best overall performance given their size/architecture. More details can be seen in our paper.
 
  <table>
  <thead>
  <tr>
  <th>Base model</th>
- <th colspan="4">LaMini series (#parameters)</th>
+ <th colspan="4">LaMini-LM series (#parameters)</th>
  </tr>
  </thead>
  <tbody>
@@ -110,10 +110,10 @@ print("Response": generated_text)
  ## Training Procedure
 
  <p align="center" width="100%">
- <a><img src="https://raw.githubusercontent.com/mbzuai-nlp/lamini/main/images/lamini-pipeline.drawio.png" alt="Title" style="width: 100%; min-width: 250px; display: block; margin: auto;"></a>
+ <a><img src="https://raw.githubusercontent.com/mbzuai-nlp/lamini-lm/main/images/lamini-pipeline.drawio.png" alt="Title" style="width: 100%; min-width: 250px; display: block; margin: auto;"></a>
  </p>
 
- We initialize with [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) and fine-tune it on our [LaMini dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). Its total number of parameters is 783M.
+ We initialize with [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) and fine-tune it on our [LaMini-instruction dataset](https://huggingface.co/datasets/MBZUAI/LaMini-instruction). Its total number of parameters is 783M.
 
  ### Training Hyperparameters
 
@@ -140,8 +140,8 @@ More information needed
 
  ```bibtex
  @misc{lamini,
- title={LaMini: A Diverse Herd of Distilled Models from Large-Scale Instructions},
- author={},
+ title={LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions},
+ author={Minghao Wu and Abdul Waheed and Chiyu Zhang and Muhammad Abdul-Mageed and Alham Fikri Aji},
  year={2023},
  publisher = {GitHub},
  journal = {GitHub repository},