---
language:
- "zh"
thumbnail: "https://raw.githubusercontent.com/SIKU-BERT/SikuBERT/main/appendix/sikubert.png"
tags:
- "chinese"
- "classical chinese"
- "literary chinese"
- "ancient chinese"
- "bert"
- "roberta"
- "pytorch"
license: "apache-2.0"
---
# SikuBERT
## Model description
![SikuBERT](https://raw.githubusercontent.com/SIKU-BERT/SikuBERT/main/appendix/sikubert.png)
Digital humanities research requires large-scale corpora and high-performance natural language processing tools for ancient Chinese. Pre-trained language models have greatly improved the accuracy of text mining in English and modern Chinese, but a pre-trained model dedicated to the automatic processing of ancient Chinese texts is still urgently needed. Using the verified, high-quality full-text corpus of the "Siku Quanshu" as the training set, and building on the BERT deep language model architecture, we constructed the SikuBERT and SikuRoBERTa pre-trained language models for intelligent processing tasks on ancient Chinese.
## How to use
```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("SIKU-BERT/sikubert")
model = AutoModel.from_pretrained("SIKU-BERT/sikubert")
```
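Once loaded, the model can be used like any BERT-style encoder. A minimal sketch of extracting contextual token embeddings for a classical Chinese sentence (the example sentence and variable names below are illustrative, not part of the original card):

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("SIKU-BERT/sikubert")
model = AutoModel.from_pretrained("SIKU-BERT/sikubert")
model.eval()

# Example classical Chinese sentence (illustrative)
text = "天行健，君子以自強不息"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# last_hidden_state has shape (batch_size, sequence_length, hidden_size)
embeddings = outputs.last_hidden_state
print(embeddings.shape)
```

These per-token embeddings can then feed downstream tasks such as word segmentation, POS tagging, or named entity recognition on ancient Chinese texts.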
## About us
We are from Nanjing Agricultural University.
> Created by SIKU-BERT [![Github icon](https://cdn0.iconfinder.com/data/icons/octicons/1024/mark-github-32.png)](https://github.com/SIKU-BERT)