Skolkovo Institute of Science and Technology committed on
Commit
544b646
1 Parent(s): 6e32a9f

Update README.md

  1. README.md +18 -3
README.md CHANGED
@@ -16,7 +16,7 @@ In this task, the model gets the string with text with the error and the exact s
16
 
17
  ## Model training details
18
 
19
- ### Data
20
 
21
  The data was provided in the following way
22
 
@@ -34,7 +34,7 @@ I want to stop smoking during driving bicycle . 23:29 A <gerund> does not normal
34
 
35
  Grammar terms are highlighted with '< ... >' marks, and word examples with '<< ... >>'.
36
 
37
- ### Data preprocessing
38
 
39
  We lowercased the text, separated the words from all punctuation, including the task-specific marks (<< >>), and explicitly marked the error in the original text using << >>.
40
 
@@ -44,6 +44,11 @@ the smoke < < flow > > < < my > > face . 10:17 When the < verb > < < flow > > is
44
  i want to stop smoking < < during > > driving bicycle . 23:29 a < gerund > does not normally follow the < preposition > < < during > > . think of an expression using the < conjunction > ' while ' instead of a < preposition > .
45
  ```
46
 
47
 
48
  ## How to use
49
 
@@ -86,4 +91,14 @@ def paraphrase(text, model, temperature=1.0, beams=3):
86
  # expected output: ["a gerund > does not normally follow the preposition > during > >. think of an expression using the conjunction >'while'instead of a preposition >."]
87
 
88
 
89
- ```
16
 
17
  ## Model training details
18
 
19
+ #### Data
20
 
21
  The data was provided in the following way
22
 
 
34
 
35
  Grammar terms are highlighted with '< ... >' marks, and word examples with '<< ... >>'.
36
 
37
+ #### Data preprocessing
38
 
39
  We lowercased the text, separated the words from all punctuation, including the task-specific marks (<< >>), and explicitly marked the error in the original text using << >>.
40
 
 
44
  i want to stop smoking < < during > > driving bicycle . 23:29 a < gerund > does not normally follow the < preposition > < < during > > . think of an expression using the < conjunction > ' while ' instead of a < preposition > .
45
  ```
46
 
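The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, not the authors' script: the function name `preprocess` is hypothetical, and it assumes that the `23:29` field is a pair of character offsets into the sentence (which matches the word "during" in the example) and that only the sentence, not the offset field, is passed in.

```python
import re


def preprocess(text: str, start: int, end: int) -> str:
    """Mark the error span with << >>, lowercase, and split off punctuation.

    `start` and `end` are assumed to be character offsets of the error span.
    """
    # Wrap the error span in the task-specific marks.
    marked = f"{text[:start]}<< {text[start:end]} >>{text[end:]}"
    # Lowercase everything.
    lowered = marked.lower()
    # Put spaces around every punctuation character, including < and >.
    spaced = re.sub(r"([^\w\s])", r" \1 ", lowered)
    # Collapse runs of whitespace introduced by the previous step.
    return " ".join(spaced.split())


print(preprocess("I want to stop smoking during driving bicycle .", 23, 29))
# i want to stop smoking < < during > > driving bicycle .
```

Because every punctuation character is separated, the `<< >>` marks end up as `< < ... > >`, as in the README example.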
47
+ #### Data augmentation
48
+
49
+ The main feature of our training pipeline was data augmentation. The idea is as follows: we cut the existing text with the error after the last word that was syntactically connected to the words inside the error span (syntactic dependencies were parsed automatically with spaCy). This truncated text was then used as a prompt for a language model (we used [GPT-Neo 1.3B](https://huggingface.co/EleutherAI/gpt-neo-1.3B)).
50
+
51
+ Using both the initial and the augmented data, we fine-tuned [t5-large](https://huggingface.co/t5-large).
52
 
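The augmentation step could look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' pipeline: all function names are hypothetical, "syntactically connected" is interpreted as "a direct head or direct child of a token in the error span", the error span is given as token indices rather than character offsets, and the spaCy model name `en_core_web_sm` is a guess. The dependency heads in the example are written by hand for illustration and are not real parser output.

```python
from typing import List, Tuple


def cut_after_last_connected(
    tokens: List[str], heads: List[int], error_span: Tuple[int, int]
) -> List[str]:
    """Cut the token list after the last token linked to the error span.

    heads[i] is the index of token i's syntactic head (the root points to
    itself, as in spaCy). error_span is a half-open range of token indices.
    """
    error = set(range(*error_span))
    connected = set(error)
    for i, head in enumerate(heads):
        if i in error:
            connected.add(head)  # head of an error token
        elif head in error:
            connected.add(i)     # direct child of an error token
    return tokens[: max(connected) + 1]


def parse_with_spacy(text: str) -> Tuple[List[str], List[int]]:
    """Hypothetical helper: get tokens and head indices from spaCy."""
    import spacy  # imported lazily; needs spaCy and an installed model

    nlp = spacy.load("en_core_web_sm")  # assumed model name
    doc = nlp(text)
    return [t.text for t in doc], [t.head.i for t in doc]


def continue_prompt(prompt: str, max_new_tokens: int = 30) -> str:
    """Hypothetical helper: let GPT-Neo 1.3B continue the truncated text."""
    from transformers import pipeline  # imported lazily; large download

    generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=True)
    return result[0]["generated_text"]


# Hand-written parse of the README example (illustrative only).
tokens = ["i", "want", "to", "stop", "smoking", "during", "driving", "bicycle", "."]
heads = [1, 1, 3, 1, 3, 4, 5, 6, 1]
prompt = " ".join(cut_after_last_connected(tokens, heads, (5, 6)))
print(prompt)  # i want to stop smoking during driving
```

Passing `prompt` to `continue_prompt` would then yield a new sentence that still contains the error span together with its immediate syntactic context, which is presumably what lets the original error explanation be reused for the generated text.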
53
  ## How to use
54
 
 
91
  # expected output: ["a gerund > does not normally follow the preposition > during > >. think of an expression using the conjunction >'while'instead of a preposition >."]
92
 
93
 
94
+ ```
95
+
96
+
97
+ ## Licensing Information
98
+
99
+ [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].
100
+
101
+ [![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
102
+
103
+ [cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
104
+ [cc-by-nc-sa-image]: https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png