samta-kamboj committed • Commit 89de3a2 • Parent(s): 3142fe5

Update README.md

README.md CHANGED
@@ -27,7 +27,7 @@ We hope this extensive release will accelerate research in Arabic NLP, and enabl
 
 ## Jais Family Details
 
-- **Developed by:**
+- **Developed by:** Inception, Cerebras Systems.
 - **Language(s):** (NLP): Arabic (MSA) and English.
 - **Input:** Text only data.
 - **Output:** Model generates text.
@@ -37,19 +37,19 @@ We hope this extensive release will accelerate research in Arabic NLP, and enabl
 
 | **Pre-trained Model** | **Fine-tuned Model** | **Size (Parameters)** | **Context length (Tokens)** |
 |:---------------------|:--------|:-------|:-------|
-| [jais-family-30b-16k](https://huggingface.co/
-| [jais-family-30b-8k](https://huggingface.co/
-| [jais-family-13b ](https://huggingface.co/
-| [jais-family-6p7b](https://huggingface.co/
-| [jais-family-2p7b](https://huggingface.co/
-| [jais-family-1p3b](https://huggingface.co/
-| [jais-family-590m](https://huggingface.co/
+| [jais-family-30b-16k](https://huggingface.co/inceptionai/jais-family-30b-16k) | [Jais-family-30b-16k-chat](https://huggingface.co/inceptionai/jais-family-30b-16k-chat) | 30B | 16,384 |
+| [jais-family-30b-8k](https://huggingface.co/inceptionai/jais-family-30b-8k) | [Jais-family-30b-8k-chat](https://huggingface.co/inceptionai/jais-family-30b-8k-chat) | 30B | 8,192 |
+| [jais-family-13b ](https://huggingface.co/inceptionai/jais-family-13b) | [Jais-family-13b-chat](https://huggingface.co/inceptionai/jais-family-13b-chat) | 13B | 2,048 |
+| [jais-family-6p7b](https://huggingface.co/inceptionai/jais-family-6p7b) | [Jais-family-6p7b-chat](https://huggingface.co/inceptionai/jais-family-6p7b-chat) | 6.7B | 2,048 |
+| [jais-family-2p7b](https://huggingface.co/inceptionai/jais-family-2p7b) | [Jais-family-2p7b-chat](https://huggingface.co/inceptionai/jais-family-2p7b-chat) | 2.7B | 2,048 |
+| [jais-family-1p3b](https://huggingface.co/inceptionai/jais-family-1p3b) | [Jais-family-1p3b-chat](https://huggingface.co/inceptionai/jais-family-1p3b-chat) | 1.3B | 2,048 |
+| [jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m) | [Jais-family-590m-chat](https://huggingface.co/inceptionai/jais-family-590m-chat) | 590M | 2,048 |
 
 | **Adapted pre-trained Model** | **Fine-tuned Model** | **Size (Parameters)** | **Context length (Tokens)** |
 |:---------------------|:--------|:-------|:-------|
-| [jais-adapted-70b](https://huggingface.co/
-| [jais-adapted-13b](https://huggingface.co/
-| [jais-adapted-7b](https://huggingface.co/
+| [jais-adapted-70b](https://huggingface.co/inceptionai/jais-adapted-70b) | [Jais-adapted-70b-chat](https://huggingface.co/inceptionai/jais-adapted-70b-chat) | 70B | 4,096 |
+| [jais-adapted-13b](https://huggingface.co/inceptionai/jais-adapted-13b) | [Jais-adapted-13b-chat](https://huggingface.co/inceptionai/jais-adapted-13b-chat) | 13B | 4,096 |
+| [jais-adapted-7b](https://huggingface.co/inceptionai/jais-adapted-7b) | [Jais-adapted-7b-chat](https://huggingface.co/inceptionai/jais-adapted-7b-chat) | 7B | 4,096 |
 
 ### Model Architecture:
 <a name="model-architecture"></a>
@@ -72,7 +72,7 @@ Below is sample code to use the model. Note that the model requires a custom mod
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
-model_path = "
+model_path = "inceptionai/jais-family-30b-8k"
 
 device = "cuda" if torch.cuda.is_available() else "cpu"
 
@@ -130,16 +130,16 @@ We extensively preprocess and deduplicate the training data. For Arabic, we used
 
 | **Pre-trained model** | **English data (tokens)** | **Arabic data (tokens)** | **Code data (tokens)** | **Total data (tokens)** |
 |-------------------------|---------------------------|--------------------------|------------------------|------------------------|
-| [jais-family-30b-16k](https://huggingface.co/
-| [jais-family-30b-8k](https://huggingface.co/
-| [jais-family-13b ](https://huggingface.co/
-| [jais-family-6p7b](https://huggingface.co/
-| [jais-family-2p7b](https://huggingface.co/
-| [jais-family-1p3b](https://huggingface.co/
-| [jais-family-590m](https://huggingface.co/
-| [jais-adapted-70b](https://huggingface.co/
-| [jais-adapted-13b](https://huggingface.co/
-| [jais-adapted-7b](https://huggingface.co/
+| [jais-family-30b-16k](https://huggingface.co/inceptionai/jais-family-30b-16k) | 980B | 490B | 196B | 1666B |
+| [jais-family-30b-8k](https://huggingface.co/inceptionai/jais-family-30b-8k) | 882B | 441B | 177B | 1500B |
+| [jais-family-13b ](https://huggingface.co/inceptionai/jais-family-13b) | 283B | 141B | 56B | 480B |
+| [jais-family-6p7b](https://huggingface.co/inceptionai/jais-family-6p7b) | 283B | 141B | 56B | 480B |
+| [jais-family-2p7b](https://huggingface.co/inceptionai/jais-family-2p7b) | 283B | 141B | 56B | 480B |
+| [jais-family-1p3b](https://huggingface.co/inceptionai/jais-family-1p3b) | 283B | 141B | 56B | 480B |
+| [jais-family-590m](https://huggingface.co/inceptionai/jais-family-590m) | 283B | 141B | 56B | 480B |
+| [jais-adapted-70b](https://huggingface.co/inceptionai/jais-adapted-70b) | 33B | 334B | 4B | 371B |
+| [jais-adapted-13b](https://huggingface.co/inceptionai/jais-adapted-13b) | 127B | 140B | 13B | 280B |
+| [jais-adapted-7b](https://huggingface.co/inceptionai/jais-adapted-7b) | 18B | 19B | 2B | 39B |
 
 ### Finetuning data
 
@@ -274,14 +274,14 @@ English prompts were translated to Arabic by our in-house linguists.
 In the following, we compare the models in this release of the jais family against previously released versions:
 
 <p align="center">
-<img src="https://huggingface.co/
+<img src="https://huggingface.co/inceptionai/jais-family-30b-16k-chat/resolve/main/jais.png" alt="Jais-adapted GPT-4">
 </p>
 <p align="center">
 <em>GPT-4-as-a-judge evaluation of Jais in Arabic and English. Jais family models are significantly better than previous Jais at generations in both languages.</em>
 </p>
 
 <p align="center">
-<img src="https://huggingface.co/
+<img src="https://huggingface.co/inceptionai/jais-family-30b-16k-chat/resolve/main/jais-adapted.png" alt="Jais-adapted GPT-4">
 </p>
 <p align="center">
 <em>GPT-4-as-a-judge evaluation of adapted Jais in Arabic and English. The generation quality of Arabic is significantly enhanced, while achieving improvement in English when compared to Llama-2 instruct.</em>
@@ -290,7 +290,7 @@ In the following, we compare the models in this release of the jais family again
 Besides pairwise comparison, we also perform MT-bench style single-answer grading on a scale of 1 to 10.
 
 <p align="center">
-<img src="https://huggingface.co/
+<img src="https://huggingface.co/inceptionai/jais-family-30b-16k-chat/resolve/main/mt_bench.png" alt="MT-bench">
 </p>
 <p align="center">
 <em>MT-bench style single-answer grading evaluation of Jais and adapted Jais in Arabic and English. Comparisons are made between select corresponding models from earlier releases. The quality ratings of responses are generally improved, with significant enhancements in Arabic.</em>
@@ -341,7 +341,7 @@ The following are some example scenarios where the model should not be used.
 
 <!-- This section is meant to convey both technical and sociotechnical limitations. -->
 
-The Jais family is trained on publicly available data which was in part curated by
+The Jais family is trained on publicly available data which was in part curated by Inception. We have employed different techniques to reduce bias in the model. While efforts have been made to minimize biases, it is likely that the model, as with all LLM models, will exhibit some bias.
 
 The fine-tuned variants are trained as an AI assistant for Arabic and English speakers. Chat models are limited to produce responses for queries in these two languages and may not produce appropriate responses to other language queries.
 
@@ -368,8 +368,8 @@ Through this release, we aim to make LLMs more accessible to Arabic NLP research
 
 @article{jaisfamilymodelcard,
   title={Jais Family Model Card},
-  author={
+  author={Inception},
   year={2024},
-  url = {https://huggingface.co/
+  url = {https://huggingface.co/inceptionai/jais-family-30b-16k-chat/blob/main/README.md}
 }
 ```
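The `@@ -72,7 +72,7 @@` hunk changes only the `model_path` line of the README's sample code, so only a fragment of the snippet is visible here. As a rough sketch of how that fragment is typically completed with the `transformers` API — the `generate` helper, the `bfloat16` dtype, and the demo prompt are illustrative assumptions, not part of this commit:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# model_path as set by this commit; any other Jais family checkpoint
# from the tables above should load the same way.
model_path = "inceptionai/jais-family-30b-8k"

device = "cuda" if torch.cuda.is_available() else "cpu"

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Complete `prompt` with the Jais checkpoint (loaded lazily)."""
    # trust_remote_code=True is needed because, per the README,
    # the model requires a custom model class.
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
    ).to(device)
    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The capital of UAE is"))
```

Note that a 30B-parameter checkpoint in bf16 is on the order of 60 GB of weights, so one of the smaller family members (e.g. jais-family-590m) is more practical for a quick smoke test.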