internetoftim committed
Commit a7b1691 • 1 parent: 2669aec
Update README.md
README.md CHANGED
@@ -25,14 +25,15 @@ It was trained with two objectives in pretraining : Masked language modeling(MLM
 
 It reduces the need of many engineering efforts for building task specific architectures through pre-trained representation. And achieves state-of-the-art performance on a large suite of sentence-level and token-level tasks.
 
-
-## Training and evaluation data
-
+## Intended uses & limitations
 This model is a pre-trained BERT-Base trained in two phases on the [Graphcore/wikipedia-bert-128](https://huggingface.co/datasets/Graphcore/wikipedia-bert-128) and [Graphcore/wikipedia-bert-512](https://huggingface.co/datasets/Graphcore/wikipedia-bert-512) datasets.
 
 It was trained on a Graphcore IPU-POD16 using [`optimum-graphcore`](https://github.com/huggingface/optimum-graphcore).
 Graphcore and Hugging Face are working together to make training of Transformer models on IPUs fast and easy. Learn more about how to take advantage of the power of Graphcore IPUs to train Transformers models at [hf.co/hardware/graphcore](https://huggingface.co/hardware/graphcore).
 
+
+## Training and evaluation data
+
 Trained on wikipedia datasets:
 - [Graphcore/wikipedia-bert-128](https://huggingface.co/datasets/Graphcore/wikipedia-bert-128)
 - [Graphcore/wikipedia-bert-512](https://huggingface.co/datasets/Graphcore/wikipedia-bert-512)
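As a usage note beyond the diff itself: the README text above describes a pre-trained BERT-Base checkpoint with a masked-language-modeling pretraining objective. Below is a minimal sketch of loading such a checkpoint with the `transformers` library; the model ID is an assumption (the commit does not name the model repository), so substitute this model's actual Hub ID.

```python
# Minimal sketch (not from the commit): load a pre-trained BERT-Base MLM
# checkpoint like the one this README describes and fill in a [MASK] token.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumed repository name for illustration only; replace with the real Hub ID.
model_id = "Graphcore/bert-base-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the [MASK] position and take the highest-scoring vocabulary token.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```

Fine-tuning on IPUs would additionally go through `optimum-graphcore` as the README notes, but the exact training setup is not shown in this commit, so it is left out of the sketch.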