Update README.md
README.md CHANGED
@@ -38,6 +38,7 @@ BLOOM-zh is trained extendedly on large amount of Traditional Chinese text data.
 * **License:** MEDIATEK RESEARCH License ([link](https://huggingface.co/ckip-joint/bloom-1b1-zh/blob/main/LICENSE_MR.md)) and RAIL License v1.0 ([link](https://huggingface.co/spaces/bigscience/license))
 * **Release Date Estimate:** Wednesday, 22.February.2023
 * **Send Questions to:** info@mtkresearch.com
+* **Paper:** [paper](https://arxiv.org/abs/2303.04715)
 * **Cite as:** MediaTek Research: Traditional Chinese-enhanced BLOOM language model. International, February 2023.
 * **Organizations of contributors:**
     * MediaTek Research
@@ -64,7 +65,7 @@ For the uses of the model, please refer to [BLOOM](https://huggingface.co/bigsci
 ## Training Data
 *This section provides a high-level overview of the training data. It is relevant for anyone who wants to know the basics of what the model is learning.*
 
-We trained the 1B1 parameter model on a total of 6 Billion tokens of mostly high quality Traditional Chinese text. Details are provided in the [paper
+We trained the 1B1 parameter model on a total of 6 Billion tokens of mostly high quality Traditional Chinese text. Details are provided in the [paper](https://arxiv.org/abs/2303.04715).
 
 ## Risks and Limitations
 *This section identifies foreseeable harms and misunderstandings.*