Update model description in README.md
Inserted a general introduction of the model and the model's intended use.
README.md
CHANGED
@@ -1,3 +1,23 @@
|
|
1 |
---
|
2 |
license: mit
|
3 |
---
|
4 |
+
## **BertForSequenceClassification model (Classical Chinese)**
|
5 |
+
|
6 |
+
This BertForSequenceClassification model predicts whether a Classical Chinese sentence is a letter title (书信标题) or not. It starts from the BERT base Chinese model (pre-trained with MLM), is further fine-tuned on a large Classical Chinese corpus (3GB of text), and is then combined with the BertForSequenceClassification architecture to perform binary classification.
|
7 |
+
|
8 |
+
**Labels: 0 = non-letter, 1 = letter**
|
9 |
+
|
10 |
+
## **Model description**
|
11 |
+
|
12 |
+
The BertForSequenceClassification architecture builds on the BERT base model and adds a fully connected linear layer on top of it to perform binary classification.
|
13 |
+
|
14 |
+
**Masked language modeling (MLM):** The masked language modeling objective randomly masks 15% of the tokens in the input, and the model is trained to predict the masked tokens. The BERT base model is pre-trained on a large corpus with this objective. BERT has been shown to produce robust word embeddings that capture rich contextual and semantic relationships. Our model starts from the publicly available pre-trained BERT Chinese model, which was trained on modern Chinese data. To perform Classical Chinese letter classification, we first fine-tuned that model on a large corpus of Classical Chinese data (3GB of text) and then connected it to the BertForSequenceClassification architecture.
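
The masking step described above can be sketched roughly as follows. This is a simplified illustration, not the training code used for this model: it assumes character-level tokens, replaces every selected token with `[MASK]` (the original BERT recipe also sometimes substitutes a random token or keeps the original), and the function name `mask_tokens` and the example sentence are made up for illustration.

```python
import random

MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=None):
    """Mask each token independently with probability mask_prob.

    Returns the masked sequence and a dict {position: original_token}
    holding the tokens the model would be trained to predict.
    """
    rng = random.Random(seed)
    masked = list(tokens)
    targets = {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            targets[i] = tok
            masked[i] = MASK_TOKEN
    return masked, targets


# Hypothetical example sentence, split into characters (assumption:
# one token per Chinese character).
tokens = list("與友人論學書")
masked, targets = mask_tokens(tokens, mask_prob=0.15, seed=0)
```

During pre-training, the loss is computed only at the positions recorded in `targets`.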
|
15 |
+
|
16 |
+
**Sequence classification:** The model adds a fully connected linear layer that outputs a score for each class, from which the class probabilities are computed. In our binary classification task, the final linear layer has two outputs, one per class.
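
As a rough illustration of such a classification head, the sketch below applies a two-output linear layer to a sentence representation and converts the resulting scores to probabilities with softmax. The vector size, weights, and the name `classification_head` are toy assumptions chosen for readability; they are not the trained parameters of this model.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classification_head(pooled, weight, bias):
    """Apply a linear layer (one weight row per class) followed by softmax.

    pooled: sentence representation, a list of H floats
    weight: list of 2 rows, each with H floats
    bias:   list of 2 floats
    Returns (predicted_label, probabilities).
    """
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weight, bias)
    ]
    probs = softmax(logits)
    label = max(range(len(probs)), key=probs.__getitem__)
    return label, probs


# Toy values (assumption): hidden size 2 instead of BERT base's 768.
label, probs = classification_head(
    pooled=[1.0, 2.0],
    weight=[[0.0, 0.0], [1.0, 0.0]],
    bias=[0.0, 0.0],
)
```

With the labels listed above, index 0 corresponds to non-letter and index 1 to letter.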
|
17 |
+
|
18 |
+
## **Intended uses & limitations**
|
19 |
+
Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
|
20 |
+
|
21 |
+
## **How to use**
|
22 |
+
You can use this model directly with a pipeline for text classification:
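
A minimal sketch of one way to do this with the `transformers` `pipeline` API and its `text-classification` task. The repository id in `MODEL_ID` is a hypothetical placeholder, and the mapping assumes the model config keeps the default `LABEL_0` / `LABEL_1` names; adjust both to match the published model.

```python
# Hypothetical placeholder: replace with the actual model repository id.
MODEL_ID = "your-namespace/your-model-id"

# Assumption: the config uses the default label names LABEL_0 / LABEL_1.
LABEL_NAMES = {"LABEL_0": "non-letter", "LABEL_1": "letter"}


def readable(prediction):
    """Map a pipeline prediction dict to the label names used in this README."""
    label = prediction["label"]
    return {"label": LABEL_NAMES.get(label, label), "score": prediction["score"]}


def classify_titles(sentences, model_id=MODEL_ID):
    """Classify a list of Classical Chinese sentences as letter / non-letter."""
    # Imported lazily so that `readable` can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    return [readable(p) for p in classifier(sentences)]
```

Calling `classify_titles([...])` with a list of sentences returns one dictionary per sentence, each with a `label` and a `score`.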
|
23 |
+
|