EZlee committed on
Commit dd06ceb
1 parent: 0e7f293

Update README.md

Files changed (1)
  1. README.md +31 -15
README.md CHANGED
@@ -6,12 +6,6 @@ language:
6
  pipeline_tag: fill-mask
7
  ---
8
 
9
- | Dataset \ Pretrained BERT | bert-base-chinese | ckiplab | GufoLab |
10
- | ------------- |:-------------:|:-------------:|:-------------:|
11
- | 5000 Traditional Chinese Dataset | 0.7183 | 0.6989 | **0.8081** |
12
- | 10000 Sol-Idea Dataset | 0.7874| 0.7913| **0.8025**|
13
- | All Datasets | 0.7694 | 0.7678 | **0.8038** |
14
-
15
  ### Model Sources
16
  - **Paper:** [BERT](https://arxiv.org/abs/1810.04805)
17
 
@@ -22,13 +16,6 @@ pipeline_tag: fill-mask
22
  This model can be used for masked language modeling
23
 
24
 
25
-
26
- ## Risks, Limitations and Biases
27
- **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**
28
-
29
- Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
30
-
31
-
32
  ## Training
33
 
34
  #### Training Procedure
@@ -41,12 +28,41 @@ botp/yentinglin-zh_TW_c4
41
 
42
  ## Evaluation
43
 
44
- #### Results
 
 
 
 
45
 
46
- [More Information Needed]
47
 
 
 
 
 
 
 
48
 
49
  ## How to Get Started With the Model
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
50
  ```python
51
  from transformers import AutoTokenizer, AutoModelForMaskedLM
52
 
 
6
  pipeline_tag: fill-mask
7
  ---
8
 
 
 
 
 
 
 
9
  ### Model Sources
10
  - **Paper:** [BERT](https://arxiv.org/abs/1810.04805)
11
 
 
16
  This model can be used for masked language modeling
17
 
18
 
 
 
 
 
 
 
 
19
  ## Training
20
 
21
  #### Training Procedure
 
28
 
29
  ## Evaluation
30
 
31
+ | Dataset \ Pretrained BERT | bert-base-chinese | ckiplab | GufoLab |
32
+ | ------------- |:-------------:|:-------------:|:-------------:|
33
+ | 5000 Traditional Chinese Dataset | 0.7183 | 0.6989 | **0.8081** |
34
+ | 10000 Sol-Idea Dataset | 0.7874| 0.7913| **0.8025**|
35
+ | All Datasets | 0.7694 | 0.7678 | **0.8038** |
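
To make the comparison in the table easier to read, the margin of the GufoLab column over the stronger of the two baselines can be computed per row. This is only a sketch: the scores are copied from the table, the metric behind them is not named in the source, and `scores` and `margin_over_best_baseline` are illustrative names rather than part of the model's code.

```python
# Scores copied from the evaluation table; the source does not name the metric.
scores = {
    "5000 Traditional Chinese Dataset": {
        "bert-base-chinese": 0.7183, "ckiplab": 0.6989, "GufoLab": 0.8081},
    "10000 Sol-Idea Dataset": {
        "bert-base-chinese": 0.7874, "ckiplab": 0.7913, "GufoLab": 0.8025},
    "All Datasets": {
        "bert-base-chinese": 0.7694, "ckiplab": 0.7678, "GufoLab": 0.8038},
}


def margin_over_best_baseline(row, target="GufoLab"):
    """Return the target's score minus the best score among the other columns."""
    best_baseline = max(value for name, value in row.items() if name != target)
    return round(row[target] - best_baseline, 4)


margins = {name: margin_over_best_baseline(row) for name, row in scores.items()}
for name, margin in margins.items():
    print(f"{name}: +{margin:.4f}")
```

The margin is largest on the 5000-sample Traditional Chinese set and smallest on the Sol-Idea set, where the `ckiplab` baseline is already close.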
36
 
37
+ #### Results
38
 
39
+ | Test ID | [MASK] Input | Output |
40
+ | -------------|-------------|-------------|
41
+ | 1|今天禮拜[MASK]?我[MASK]是很想[MASK]班。|今天禮拜六?我不是很想上班。 |
42
+ | 2|[MASK]灣並[MASK]是[MASK]國不可分割的一部分。|臺灣並不是中國不可分割的一部分。 |
43
+ | 3|如果可以是韋[MASK]安的最新歌[MASK]。|如果可以是韋禮安的最新歌曲。 |
44
+ | 4|[MASK]水老[MASK]有賣很多鐵蛋的攤販。|淡水老街有賣很多鐵蛋的攤販。 |
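
Each row above pairs a masked input with the completed sentence. As a minimal sketch of the final substitution step only, the hypothetical helper `fill_masks` below replaces each `[MASK]` in order with a predicted token. The tokens are taken from rows 1 and 4 of the table; they are not produced by running the model here.

```python
def fill_masks(text, tokens, mask="[MASK]"):
    """Replace each occurrence of `mask` in `text`, left to right, with `tokens`."""
    parts = text.split(mask)
    if len(parts) - 1 != len(tokens):
        raise ValueError(
            f"expected {len(parts) - 1} tokens for the masks, got {len(tokens)}"
        )
    pieces = [parts[0]]
    for token, tail in zip(tokens, parts[1:]):
        pieces.append(token)
        pieces.append(tail)
    return "".join(pieces)


# Row 1 of the table: three masks, filled with the tokens shown in the output column.
row1 = fill_masks("今天禮拜[MASK]?我[MASK]是很想[MASK]班。", ["六", "不", "上"])
# Row 4 of the table: two masks.
row4 = fill_masks("[MASK]水老[MASK]有賣很多鐵蛋的攤販。", ["淡", "街"])
print(row1)
print(row4)
```

In practice the tokens would come from the top prediction at each mask position; the helper only shows how those predictions map back onto the input string.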
45
 
46
  ## How to Get Started With the Model
47
+ #### Private Model Download
48
+
49
+ **Installation**
50
+ ```
51
+ $ curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
52
+ $ sudo apt-get install git-lfs
53
+ $ git lfs install
54
+ $ pip install huggingface_hub
55
+
56
+ ```
57
+ **Log in to Hugging Face**
58
+
59
+ ```
60
+ $ huggingface-cli login
61
+ Token: your own 'write' token
62
+ ```
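
The interactive login above can also be replaced by passing a token from Python. The sketch below is an alternative under stated assumptions, not the author's method: it reads the token from an environment variable (`HF_TOKEN` is assumed here) and hands it to `huggingface_hub.snapshot_download`. The repository id is a placeholder because the source does not name the private repository, and `read_token` and `download_private_model` are hypothetical helper names.

```python
import os


def read_token(env_var="HF_TOKEN"):
    """Return the access token stored in `env_var`, or raise if it is missing."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token first.")
    return token


def download_private_model(repo_id, token):
    """Download every file of a (private) model repo and return the local path."""
    # Imported lazily so the token helper works even without huggingface_hub installed.
    # The `token` keyword assumes a recent huggingface_hub release.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, token=token)


# Usage (needs network access and a real repository id, which the source does not give):
# local_dir = download_private_model("<user>/<private-model>", read_token())
```

Keeping the token in an environment variable avoids typing it into a prompt and keeps it out of the script itself.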
63
+
64
+ **Python Code**
65
+
66
  ```python
67
  from transformers import AutoTokenizer, AutoModelForMaskedLM
68