## Training and evaluation data

We conducted two sets of evaluations: automatic evaluation on downstream NLP tasks and human evaluation on user-oriented instructions. For more details, please refer to our [paper]().

## Models

You can download the LaMini model series as follows. Note that not all models perform equally well; models marked with ✩ achieve the best overall performance for their size/architecture. More details can be found in our paper.
<table>
<caption>
LaMini Language Models collection.
</caption>
<thead>
<tr>
<th>Name</th>
<th>Architecture</th>
<th>Initialization</th>
</tr>
</thead>
<tbody>
<tr>
<td>LaMini-T5-61M</td>
<td>encoder-decoder</td>
<td>T5-small</td>
</tr>
<tr>
<td>LaMini-T5-223M</td>
<td>encoder-decoder</td>
<td>T5-base</td>
</tr>
<tr>
<td>LaMini-T5-738M</td>
<td>encoder-decoder</td>
<td>T5-large</td>
</tr>
<tr>
<td>LaMini-Flan-T5-77M</td>
<td>encoder-decoder</td>
<td>Flan-T5-small</td>
</tr>
<tr>
<td>LaMini-Flan-T5-248M</td>
<td>encoder-decoder</td>
<td>Flan-T5-base</td>
</tr>
<tr>
<td>LaMini-Flan-T5-783M</td>
<td>encoder-decoder</td>
<td>Flan-T5-large</td>
</tr>
<tr>
<td>LaMini-Cb-111M</td>
<td>decoder-only</td>
<td>Cerebras-GPT-111M</td>
</tr>
<tr>
<td>LaMini-Cb-256M</td>
<td>decoder-only</td>
<td>Cerebras-GPT-256M</td>
</tr>
<tr>
<td>LaMini-Cb-590M</td>
<td>decoder-only</td>
<td>Cerebras-GPT-590M</td>
</tr>
<tr>
<td>LaMini-Cb-1.3B</td>
<td>decoder-only</td>
<td>Cerebras-GPT-1.3B</td>
</tr>
<tr>
<td>LaMini-GPT-124M</td>
<td>decoder-only</td>
<td>GPT-2</td>
</tr>
<tr>
<td>LaMini-GPT-774M</td>
<td>decoder-only</td>
<td>GPT-2 large</td>
</tr>
<tr>
<td>LaMini-GPT-1.5B</td>
<td>decoder-only</td>
<td>GPT-2 xl</td>
</tr>
</tbody>
</table>
## Use
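Assuming the models in the table above are published on the Hugging Face Hub (the `MBZUAI/LaMini-Flan-T5-248M` model ID below is an assumption; substitute any model from the collection), a minimal sketch of running an instruction through one of the encoder-decoder models with the `transformers` pipeline:

```python
# Minimal sketch: run an instruction through a LaMini encoder-decoder model.
# The default model ID is an assumption; swap in any model from the table above.
from transformers import pipeline

def generate(instruction: str, model_id: str = "MBZUAI/LaMini-Flan-T5-248M") -> str:
    """Load the model and return its response to a single instruction."""
    generator = pipeline("text2text-generation", model=model_id)
    return generator(instruction, max_length=512)[0]["generated_text"]

# Example usage (downloads the model weights on first call):
# print(generate("Please explain what a language model is."))
```

For the decoder-only models (LaMini-GPT, LaMini-Cb), use the `text-generation` pipeline instead of `text2text-generation`.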