KES committed on
Commit
788eb3a
1 Parent(s): 64f7cf4

Update README.md

Files changed (1)
  1. README.md +70 -70
README.md CHANGED
@@ -1,70 +1,70 @@
- ---
-
- language: en
-
- tags:
-
- - sentence correction
-
- - text2text-generation
-
- license: cc-by-nc-sa-4.0
-
- datasets:
-
- - jfleg
-
- ---
-
- # Model
- This model utilises T5-base sentence correction pre-trained model. It was fine tuned using a modified version of the [JFLEG](https://arxiv.org/abs/1702.04066) dataset and [Happy Transformer framework](https://github.com/EricFillion/happy-transformer). This model was pre-trained for educational purposes only for correction on local Caribbean dialect. For more on Caribbean dialect checkout the library [Caribe](https://pypi.org/project/Caribe/).
- .
- ___
-
-
- # Re-training/Fine Tuning
-
- The results of fine-tuning resulted in a final accuracy of 90%
-
-
- # Usage
-
-
-
- ```python
-
- from happytransformer import HappyTextToText, TTSettings
-
- pre_trained_model="T5"
- model = HappyTextToText(pre_trained_model, "KES/T5-KES")
-
- arguments = TTSettings(num_beams=4, min_length=1)
- sentence = "Wat iz your nam"
-
- correction = model.generate_text("grammar: "+sentence, args=arguments)
- if(correction.text.find(" .")):
-     correction.text=correction.text.replace(" .", ".")
-
- print(correction.text) # Correction: "What is your name?".
-
- ```
- _
- # Usage with Transformers
-
- ```python
-
- from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
-
- tokenizer = AutoTokenizer.from_pretrained("KES/T5-KES")
-
- model = AutoModelForSeq2SeqLM.from_pretrained("KES/T5-KES")
-
- text = "I am lived with my parenmts "
- inputs = tokenizer("grammar:"+text, truncation=True, return_tensors='pt')
-
- output = model.generate(inputs['input_ids'], num_beams=4, max_length=512, early_stopping=True)
- correction=tokenizer.batch_decode(output, skip_special_tokens=True)
- print("".join(correction)) #Correction: I am living with my parents.
-
- ```
-
 
+ ---
+
+ language: en
+
+ tags:
+
+ - sentence correction
+
+ - text2text-generation
+
+ license: cc-by-nc-sa-4.0
+
+ datasets:
+
+ - jfleg
+
+ ---
+
+ # Model
+ This model utilises T5-base sentence correction pre-trained model. It was fine tuned using a modified version of the [JFLEG](https://arxiv.org/abs/1702.04066) dataset and [Happy Transformer framework](https://github.com/EricFillion/happy-transformer). This model was pre-trained for educational purposes only for correction on local Caribbean English Creole. For more on the Caribbean English Creole checkout the library [Caribe](https://pypi.org/project/Caribe/).
+ .
+ ___
+
+
+ # Re-training/Fine Tuning
+
+ The results of fine-tuning resulted in a final accuracy of 90%
+
+
+ # Usage
+
+
+
+ ```python
+
+ from happytransformer import HappyTextToText, TTSettings
+
+ pre_trained_model="T5"
+ model = HappyTextToText(pre_trained_model, "KES/T5-KES")
+
+ arguments = TTSettings(num_beams=4, min_length=1)
+ sentence = "Wat iz your nam"
+
+ correction = model.generate_text("grammar: "+sentence, args=arguments)
+ if(correction.text.find(" .")):
+     correction.text=correction.text.replace(" .", ".")
+
+ print(correction.text) # Correction: "What is your name?".
+
+ ```
+ _
+ # Usage with Transformers
+
+ ```python
+
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+
+ tokenizer = AutoTokenizer.from_pretrained("KES/T5-KES")
+
+ model = AutoModelForSeq2SeqLM.from_pretrained("KES/T5-KES")
+
+ text = "I am lived with my parenmts "
+ inputs = tokenizer("grammar:"+text, truncation=True, return_tensors='pt')
+
+ output = model.generate(inputs['input_ids'], num_beams=4, max_length=512, early_stopping=True)
+ correction=tokenizer.batch_decode(output, skip_special_tokens=True)
+ print("".join(correction)) #Correction: I am living with my parents.
+
+ ```
+
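
Note on the post-processing step in the Usage snippet: it guards `replace` with `str.find`, which returns an index rather than a boolean. The result is `-1` (truthy) when `" ."` is absent and `0` (falsy) when the match is at the very start, so the guard does not test what it appears to. A minimal sketch of the same cleanup without the guard follows, assuming the only goal is to drop the space before a period. `tidy_period_spacing` is a hypothetical helper name used for illustration; it is not part of Happy Transformer or this model card.

```python
def tidy_period_spacing(text: str) -> str:
    """Remove the space that can appear before a period in generated text.

    Hypothetical helper for illustration only.
    """
    # str.replace is a no-op when " ." is absent, so no find() guard is needed.
    return text.replace(" .", ".")


# str.find returns an index, not a boolean: -1 means "not found".
print("What is your name".find(" ."))  # -1, which is truthy in an if-statement
print(tidy_period_spacing("I am living with my parents ."))
```

In the snippet above this would amount to `correction.text = tidy_period_spacing(correction.text)`; if an explicit check is still wanted, `" ." in correction.text` expresses the membership test directly.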