omkarthawakar committed
Commit 7dbe7be • 1 Parent(s): 858e5f8
Update README.md
README.md CHANGED
@@ -46,6 +46,18 @@ print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
 
 ```
 
+## Training DataMix
+| Subset | Tokens (Billion) |
+| ----------- | ----------- |
+| Arxiv | 30.00 |
+| Book | 28.86 |
+| C4 | 197.67 |
+| Refined-Web | 665.01 |
+| StarCoder | 291.92 |
+| StackExchange | 21.75 |
+| Wikipedia | 23.90 |
+| Total | 1259.13 |
+
 ## Hyperparameters
 | Hyperparameter | Value |
 | ----------- | ----------- |
@@ -79,4 +91,8 @@ print(tokenizer.batch_decode(outputs[:, input_ids.shape[1]:-1])[0].strip())
 Given the nature of the training data, the MobiLlama-05B model is best suited for prompts using the QA format, the chat format, and the code format.
 
 ## Citation
-
+**BibTeX:**
+
+```bibtex
+coming soon
+```
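As a reading aid for the Training DataMix table added in this commit, the following minimal sketch (not part of the commit itself) computes each subset's share of the token mix. The names `DATA_MIX_BILLIONS`, `STATED_TOTAL_BILLIONS`, and `subset_shares` are illustrative only; the numbers are copied from the table. Shares are taken against the sum of the listed rows, which comes to about 1259.11 and differs from the stated total of 1259.13 by 0.02, presumably because the per-subset figures are rounded.

```python
# Token counts in billions, copied from the Training DataMix table in this commit.
DATA_MIX_BILLIONS = {
    "Arxiv": 30.00,
    "Book": 28.86,
    "C4": 197.67,
    "Refined-Web": 665.01,
    "StarCoder": 291.92,
    "StackExchange": 21.75,
    "Wikipedia": 23.90,
}

# Value of the table's "Total" row (illustrative constant name).
STATED_TOTAL_BILLIONS = 1259.13


def subset_shares(mix):
    """Return each subset's fraction of the summed token count."""
    total = sum(mix.values())
    return {name: tokens / total for name, tokens in mix.items()}


if __name__ == "__main__":
    summed = sum(DATA_MIX_BILLIONS.values())
    print(f"Sum of rows: {summed:.2f}B (table total: {STATED_TOTAL_BILLIONS:.2f}B)")
    # List subsets from largest to smallest share of the mix.
    shares = subset_shares(DATA_MIX_BILLIONS)
    for name, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:<14}{share:6.1%}")
```

Refined-Web is the largest subset by this calculation, followed by StarCoder and C4.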