Update README colab notebook
README.md
CHANGED
````diff
@@ -7,14 +7,16 @@ tags:
 - 文言文
 - ancient
 - classical
+- letter
+- 书信标题
 license: cc-by-nc-sa-4.0
 ---
 
 # BertForSequenceClassification model (Classical Chinese)
-
-This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is a letter title (书信标题) or not. This model is first inherited from the BERT base Chinese model (MLM), and finetuned using a large corpus of Classical Chinese language (3GB textual dataset), then concatenated with the BertForSequenceClassification architecture to perform a binary classification task.
-
-#### Labels: 0 = non-letter, 1 = letter
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1jVu2LrNwkLolItPALKGNjeT6iCfzF8Ic?usp=sharing/)
+
+This BertForSequenceClassification Classical Chinese model is intended to predict whether a Classical Chinese sentence is a letter title (书信标题) or not. This model is first inherited from the BERT base Chinese model (MLM), and finetuned using a large corpus of Classical Chinese language (3GB textual dataset), then concatenated with the BertForSequenceClassification architecture to perform a binary classification task.
+* Labels: 0 = non-letter, 1 = letter
 
 ## Model description
 
@@ -35,12 +37,13 @@ Note that this model is primiarly aimed at predicting whether a Classical Chines
 
 Here is how to use this model to get the features of a given text in PyTorch:
 
-1. Import model
+1. Import model and packages
 ```python
 from transformers import BertTokenizer
 from transformers import BertForSequenceClassification
 import torch
 from numpy import exp
+import numpy as np
 
 tokenizer = BertTokenizer.from_pretrained('bert-base-chinese')
 model = BertForSequenceClassification.from_pretrained('cbdb/ClassicalChineseLetterClassification',
````