indiejoseph committed on
Commit
d9b4729
1 Parent(s): b130793

Update README.md

  1. README.md +10 -2
README.md CHANGED
@@ -1,9 +1,17 @@
1
  ---
 
2
  tags:
3
  - generated_from_trainer
4
  model-index:
5
  - name: bert-base-cantonese
6
  results: []
 
 
 
 
 
 
 
7
  ---
8
 
9
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -11,11 +19,11 @@ should probably proofread and complete it, then remove this comment. -->
11
 
12
  # bert-base-cantonese
13
 
14
- This model was trained from scratch on an unknown dataset.
15
 
16
  ## Model description
17
 
18
- More information needed
19
 
20
  ## Intended uses & limitations
21
 
 
1
  ---
2
+ base_model: bert-base-chinese
3
  tags:
4
  - generated_from_trainer
5
  model-index:
6
  - name: bert-base-cantonese
7
  results: []
8
+ license: cc-by-4.0
9
+ language:
10
+ - yue
11
+ pipeline_tag: fill-mask
12
+ widget:
13
+ - text: 香港原本[MASK]一個人煙稀少嘅漁港。
14
+ example_title: 係
15
  ---
16
 
17
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 
19
 
20
  # bert-base-cantonese
21
 
22
+ This model is a continued pre-training of bert-base-chinese on a Cantonese Common Crawl dataset of 198M tokens.
23
 
24
  ## Model description
25
 
26
+ This model extends the vocabulary with 500 more Chinese characters that are very common in Cantonese, such as 冧, 噉, 麪, 笪, 冚, and 乸.
27
 
28
  ## Intended uses & limitations
29
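The new metadata declares `pipeline_tag: fill-mask` and a widget sentence with one `[MASK]` token. A minimal sketch of how that widget example could be reproduced locally with the `transformers` fill-mask pipeline is shown below. The repo id `indiejoseph/bert-base-cantonese` is an assumption pieced together from the commit author and the model name (the diff does not state it), and `mask_char` and `top_predictions` are hypothetical helper names used only for illustration.

```python
# Assumed repo id: not stated anywhere in the diff, adjust as needed.
MODEL_ID = "indiejoseph/bert-base-cantonese"


def mask_char(sentence: str, index: int, mask_token: str = "[MASK]") -> str:
    """Replace the single character at `index` with the mask token."""
    if not 0 <= index < len(sentence):
        raise IndexError(f"index {index} is outside the sentence")
    return sentence[:index] + mask_token + sentence[index + 1:]


def top_predictions(masked: str, model_id: str = MODEL_ID, k: int = 5):
    """Return (token, score) pairs for the masked position.

    Imported lazily because it needs `transformers`, a backend such as
    PyTorch, and network access to download the model on first use.
    """
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_id)
    return [(r["token_str"], r["score"]) for r in fill(masked, top_k=k)]


if __name__ == "__main__":
    # Rebuild the widget sentence by masking the fifth character (係).
    masked = mask_char("香港原本係一個人煙稀少嘅漁港。", 4)
    print(masked)
    for token, score in top_predictions(masked):
        print(f"{token}\t{score:.4f}")
```

The widget's `example_title` suggests that 係 is the expected fill for this sentence, but the diff reports no scores, so the sketch prints whatever the model returns rather than asserting a result.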