---
language:
- "zh"
thumbnail: "https://raw.githubusercontent.com/SIKU-BERT/SikuBERT/main/appendix/sikubert.png"
tags:
- "chinese"
- "classical chinese"
- "literary chinese"
- "ancient chinese"
- "bert"
- "roberta"
- "pytorch"
license: "apache-2.0"
---
# SikuBERT
## Model description
![SikuBERT](https://raw.githubusercontent.com/SIKU-BERT/SikuBERT/main/appendix/sikubert.png)
Digital humanities research requires large-scale corpora and high-performance natural language processing tools for ancient Chinese. Pre-trained language models have greatly improved the accuracy of text mining in English and modern Chinese, but a pre-trained model dedicated to the automatic processing of ancient Chinese texts is still urgently needed. Using the verified, high-quality full-text corpus of the "Siku Quanshu" as the training set, and building on the BERT deep language model architecture, we constructed the SikuBERT and SikuRoBERTa pre-trained language models for intelligent processing tasks on ancient Chinese.
## How to use
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("SIKU-BERT/sikubert")
model = AutoModel.from_pretrained("SIKU-BERT/sikubert")
```
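Once loaded, the model can be used like any BERT-style encoder. A minimal sketch of extracting contextual token embeddings for a classical Chinese sentence (the example sentence and variable names below are illustrative, not part of the original card):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("SIKU-BERT/sikubert")
model = AutoModel.from_pretrained("SIKU-BERT/sikubert")
model.eval()

# Example classical Chinese sentence (illustrative)
text = "天行健，君子以自強不息"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

These per-token embeddings can then feed downstream tasks such as word segmentation, POS tagging, or named entity recognition on ancient Chinese texts.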
## About us
We are from Nanjing Agricultural University.
> Created by SIKU-BERT [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/SIKU-BERT)