---
license: apache-2.0
datasets:
- CodeTed/CGEDit_dataset
language:
- zh
metrics:
- accuracy
library_name: transformers
tags:
- CGED
- CSC
pipeline_tag: text2text-generation
---
# CGEDit - Chinese Grammatical Error Diagnosis by Task-Specific Instruction Tuning

This model was obtained by fine-tuning `ClueAI/PromptCLUE-base-v1-5` on the `CodeTed/CGEDit_dataset` dataset.
![CGEDit_model.png](https://cdn-uploads.huggingface.co/production/uploads/64c7473f513a7fa7c32e153b/AtlsZUWz86rKyb_9EWlDa.png)

## Model Details
### Model Description
- Language(s) (NLP): `Chinese`
- Finetuned from model: `ClueAI/PromptCLUE-base-v1-5`
### Model Sources
- Repository: [https://github.com/TedYeh/Chinese_spelling_Correction](https://github.com/TedYeh/Chinese_spelling_Correction)

## Usage
```python
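from transformers import AutoTokenizer, T5ForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("CodeTed/CGEDit")
model = T5ForConditionalGeneration.from_pretrained("CodeTed/CGEDit")

# Prompt format: "<instruction>: <sentence>".
# In English: "Correct the typos in the sentence: After reading that passage, I am against it!"
# (文張 is a misspelling of 文章, "article".)
input_text = '糾正句子裡的錯字: 看完那段文張,我是反對的!'
input_ids = tokenizer(input_text, return_tensors="pt").input_ids
outputs = model.generate(input_ids, max_length=256)
edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(edited_text)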
```
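
To correct several sentences in one call, the usage example above can be wrapped in two small helpers. This is a minimal sketch, not part of the released code: `build_prompt` and `correct_sentences` are hypothetical names, and the instruction prefix is copied from the example above on the assumption that the model expects the same `instruction: sentence` format for every input.

```python
from typing import List

# Instruction prefix taken from the usage example; whether the model accepts
# other instructions is not stated in this card (assumption: it is reused as-is).
SPELLING_INSTRUCTION = "糾正句子裡的錯字"


def build_prompt(sentence: str, instruction: str = SPELLING_INSTRUCTION) -> str:
    """Join instruction and sentence in the 'instruction: sentence' form."""
    return f"{instruction}: {sentence.strip()}"


def correct_sentences(sentences: List[str], tokenizer, model,
                      max_length: int = 256) -> List[str]:
    """Run a batch of sentences through the model (hypothetical helper).

    `tokenizer` and `model` are the objects loaded in the usage example.
    """
    prompts = [build_prompt(s) for s in sentences]
    # Pad to the longest prompt so the batch forms a single tensor.
    batch = tokenizer(prompts, return_tensors="pt", padding=True)
    outputs = model.generate(
        input_ids=batch["input_ids"],
        attention_mask=batch["attention_mask"],
        max_length=max_length,
    )
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With `tokenizer` and `model` loaded as shown earlier, `correct_sentences(["看完那段文張,我是反對的!"], tokenizer, model)` returns a list with one corrected string per input sentence.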