update
README.md
CHANGED
@@ -19,10 +19,10 @@ Pretrained model on oldhan Chinese language using a masked language modeling (ML
 
 ## Training Datasets
 The copyright of the datasets belongs to the Institute of Linguistics, Academia Sinica.
-* [中央研究院上古漢語標記語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/akiwi/kiwi.sh
-* [中央研究院中古漢語語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/dkiwi/kiwi.sh
-* [中央研究院近代漢語語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/pkiwi/kiwi.sh
-* [中央研究院現代漢語語料庫](http://
+* [中央研究院上古漢語標記語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/akiwi/kiwi.sh)
+* [中央研究院中古漢語語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/dkiwi/kiwi.sh)
+* [中央研究院近代漢語語料庫](http://lingcorpus.iis.sinica.edu.tw/cgi-bin/kiwi/pkiwi/kiwi.sh)
+* [中央研究院現代漢語語料庫](http://asbc.iis.sinica.edu.tw)
 
 ## Contributors
 * Chin-Tung Lin at [CKIP](https://ckip.iis.sinica.edu.tw/)
@@ -36,14 +36,14 @@ The copyright of the datasets belongs to the Institute of Linguistics, Academia
 AutoModel,
 )
 
-tokenizer = AutoTokenizer.from_pretrained("ckiplab/
-model = AutoModel.from_pretrained("ckiplab/
+tokenizer = AutoTokenizer.from_pretrained("ckiplab/han-bert-base-chinese")
+model = AutoModel.from_pretrained("ckiplab/han-bert-base-chinese")
 ```
 
 * Using our model for inference
 ```python
 >>> from transformers import pipeline
->>> unmasker = pipeline('fill-mask', model='ckiplab/
+>>> unmasker = pipeline('fill-mask', model='ckiplab/han-bert-base-chinese')
 >>> unmasker("黎[MASK]於變時雍。")
 
 [{'sequence': '黎 民 於 變 時 雍 。',