gotutiyan committed on
Commit
4074b02
1 Parent(s): ef6014e

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +45 -0
README.md ADDED
@@ -0,0 +1,45 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - GECToR_gotutiyan
+ ---
+
+ # gector sample
+ This is an unofficial pretrained model of GECToR ([Omelianchuk+ 2020](https://aclanthology.org/2020.bea-1.16/)).
+
+ ### How to use
+ The code is available from https://github.com/gotutiyan/gector.
+
+ CLI
+ ```sh
+ python predict.py --input <raw text file> --restore_dir gotutiyan/gector-roberta-base-5k --out <path to output file>
+ ```
+
+ API
+ ```py
+ from transformers import AutoTokenizer
+ from gector.modeling import GECToR
+ from gector.predict import predict, load_verb_dict
+ import torch
+
+ model_id = 'gotutiyan/gector-roberta-base-5k'
+ model = GECToR.from_pretrained(model_id)
+ if torch.cuda.is_available():
+     model.cuda()
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ encode, decode = load_verb_dict('data/verb-form-vocab.txt')
+ srcs = [
+     'This is a correct sentence.',
+     'This are a wrong sentences'
+ ]
+ corrected = predict(
+     model, tokenizer, srcs,
+     encode, decode,
+     keep_confidence=0.0,
+     min_error_prob=0.0,
+     n_iteration=5,
+     batch_size=2,
+ )
+ print(corrected)
+ ```
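
The CLI form reads a raw text file and writes an output file, while the API form works on an in-memory list. A minimal sketch of bridging the two is given here, under stated assumptions: the input file holds one sentence per line (the README does not specify the format), and the correction step returns one corrected string per input sentence. The names `read_sources`, `correct_file`, and `correct_fn` are hypothetical helpers for illustration and are not part of the gector repository.

```python
from typing import Callable, List


def read_sources(path: str) -> List[str]:
    """Read sentences from a text file, one per line, skipping blank lines.

    The one-sentence-per-line layout is an assumption, not something the
    README states.
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def correct_file(
    in_path: str,
    out_path: str,
    correct_fn: Callable[[List[str]], List[str]],
) -> List[str]:
    """Apply a correction function to every sentence in in_path.

    correct_fn is a hypothetical callable that takes a list of source
    sentences and returns a list of corrected sentences of equal length.
    """
    srcs = read_sources(in_path)
    corrected = correct_fn(srcs)
    if len(corrected) != len(srcs):
        raise ValueError("correct_fn must return one output per input sentence")
    # Write one corrected sentence per line, mirroring the input layout.
    with open(out_path, "w", encoding="utf-8") as f:
        for sent in corrected:
            f.write(sent + "\n")
    return corrected
```

With the objects from the API example in scope, `correct_fn` could plausibly be a small wrapper that forwards the sentence list to `predict` together with `model`, `tokenizer`, `encode`, and `decode`; whether the return value of `predict` matches the one-string-per-sentence assumption should be checked against the repository.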