update README text color
Browse files
README.md CHANGED
@@ -12,13 +12,13 @@ tags:
 license: cc-by-nc-sa-4.0
 ---
 
-# BertForSequenceClassification model (Classical Chinese)
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jVu2LrNwkLolItPALKGNjeT6iCfzF8Ic?usp=sharing/)
 
-This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is a letter title (书信标题) or not. This model is derived from the BERT base Chinese model (MLM), fine-tuned on a large corpus of Classical Chinese (a 3 GB textual dataset), and then combined with the BertForSequenceClassification architecture to perform a binary classification task.
-* Labels: 0 = non-letter, 1 = letter
 
-## Model description
 
 The BertForSequenceClassification model architecture inherits the BERT base model and adds a fully-connected linear layer on top to perform binary classification. More precisely, it
 was pretrained with two objectives:
@@ -27,17 +27,17 @@ was pretrained with two objectives:
 
 - Sequence classification: the model adds a fully-connected linear layer to output the probability of each class. In our binary classification task, the final linear layer has two outputs.
 
-## Intended uses & limitations
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
-### How to use
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
-1. Import model and packages
 ```python
 from transformers import BertTokenizer
 from transformers import BertForSequenceClassification
@@ -51,7 +51,7 @@ model = BertForSequenceClassification.from_pretrained('cbdb/ClassicalChineseLett
 output_hidden_states=False)
 ```
 
-2. Make a prediction
 ```python
 max_seq_len = 512
 
@@ -86,7 +86,7 @@ label2idx = {'not-letter': 0,'letter': 1}
 idx2label = {v:k for k,v in label2idx.items()}
 ```
 
-3. Change your sentence here
 ```python
 label2idx = {'not-letter': 0,'letter': 1}
 idx2label = {v:k for k,v in label2idx.items()}
@@ -97,8 +97,10 @@ print(f'The predicted probability for the {list(pred_class_proba.keys())[0]} cla
 print(f'The predicted probability for the {list(pred_class_proba.keys())[1]} class: {list(pred_class_proba.values())[1]}')
 >>> The predicted probability for the not-letter class: 0.002029061783105135
 >>> The predicted probability for the letter class: 0.9979709386825562
-
 pred_class = idx2label[np.argmax(list(pred_class_proba.values()))]
 print(f'The predicted class is: {pred_class}')
 >>> The predicted class is: letter
-```
 license: cc-by-nc-sa-4.0
 ---
 
+# <font color="IndianRed"> BertForSequenceClassification model (Classical Chinese) </font>
 [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jVu2LrNwkLolItPALKGNjeT6iCfzF8Ic?usp=sharing/)
 
+This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is <font color="IndianRed"> a letter title (书信标题) </font> or not. This model is derived from the BERT base Chinese model (MLM), fine-tuned on a large corpus of Classical Chinese (a 3 GB textual dataset), and then combined with the BertForSequenceClassification architecture to perform a binary classification task.
+* <font color="Salmon"> Labels: 0 = non-letter, 1 = letter </font>
 
+## <font color="IndianRed"> Model description </font>
 
 The BertForSequenceClassification model architecture inherits the BERT base model and adds a fully-connected linear layer on top to perform binary classification. More precisely, it
 was pretrained with two objectives:
 
 - Sequence classification: the model adds a fully-connected linear layer to output the probability of each class. In our binary classification task, the final linear layer has two outputs.
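The classification head described above can be sketched in isolation. This is a minimal NumPy illustration with random weights, not the model's trained parameters: a linear layer maps BERT's 768-dimensional pooled output to two logits, and softmax turns those logits into the two class probabilities.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical pooled [CLS] output (BERT base hidden size is 768)
rng = np.random.default_rng(0)
pooled = rng.standard_normal(768)

# The head: one fully-connected layer with 2 outputs (illustrative init)
W = rng.standard_normal((768, 2)) * 0.02  # weight matrix
b = np.zeros(2)                           # bias

logits = pooled @ W + b   # shape (2,): one logit per class
probs = softmax(logits)   # probabilities for non-letter / letter
print(probs)              # two positive numbers summing to 1
```

In the real model these weights are learned during fine-tuning; only the shape of the computation is shown here.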
 
+## <font color="IndianRed"> Intended uses & limitations </font>
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
+### <font color="IndianRed"> How to use </font>
 
 Note that this model is primarily aimed at predicting whether a Classical Chinese sentence is a letter title (书信标题) or not.
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
+<font color="cornflowerblue"> 1. Import model and packages </font>
 ```python
 from transformers import BertTokenizer
 from transformers import BertForSequenceClassification
 
 output_hidden_states=False)
 ```
 
+<font color="cornflowerblue"> 2. Make a prediction </font>
 ```python
 max_seq_len = 512
 
 idx2label = {v:k for k,v in label2idx.items()}
 ```
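To make the label-mapping step concrete, here is a self-contained sketch of how softmax probabilities become `pred_class_proba` and a predicted label. The logits are made up for illustration and stand in for the model's output on one sentence:

```python
import numpy as np

label2idx = {'not-letter': 0, 'letter': 1}
idx2label = {v: k for k, v in label2idx.items()}

# Hypothetical logits in place of the model's real output
logits = np.array([-3.1, 3.1])

# Softmax converts logits to class probabilities
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()

pred_class_proba = {idx2label[i]: float(p) for i, p in enumerate(probs)}
pred_class = idx2label[int(np.argmax(list(pred_class_proba.values())))]
print(pred_class_proba)
print(pred_class)  # 'letter', since the second logit is larger
```

The dict comprehension preserves the label order of `label2idx`, which is why `np.argmax` over `pred_class_proba.values()` lines up with the indices in `idx2label`.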
 
+<font color="cornflowerblue"> 3. Change your sentence here </font>
 ```python
 label2idx = {'not-letter': 0,'letter': 1}
 idx2label = {v:k for k,v in label2idx.items()}
 
 print(f'The predicted probability for the {list(pred_class_proba.keys())[1]} class: {list(pred_class_proba.values())[1]}')
 >>> The predicted probability for the not-letter class: 0.002029061783105135
 >>> The predicted probability for the letter class: 0.9979709386825562
+```
+```python
 pred_class = idx2label[np.argmax(list(pred_class_proba.values()))]
 print(f'The predicted class is: {pred_class}')
 >>> The predicted class is: letter
+```
+