Skolkovo Institute of Science and Technology committed
Commit b92b103
1 Parent(s): abb1785

Update README.md

Files changed (1)
  1. README.md +17 -42
README.md CHANGED
@@ -14,6 +14,23 @@ This model was trained in terms of [GenChal 2022: Feedback Comment Generation fo
 
 In this task, the model receives a string of text containing an error, together with the exact span of the error, and should return a comment in natural language that explains the nature of the error.
 
 ## Model training details
 
 #### Data
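As a concrete illustration (not part of the original README), the error span above is simply a pair of character offsets into the raw string, so slicing the text with those offsets recovers the erroneous fragment:

```python
# Hypothetical example: how a "start:end" span indexes into the input text.
text_with_error = 'I want to stop smoking during driving bicycle .'
error_span = '23:29'

off1, off2 = map(int, error_span.split(':'))
print(text_with_error[off1:off2])  # -> 'during'
```

The model's preprocessing then wraps exactly this slice in pointer tokens before generation.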
@@ -50,48 +67,6 @@ The main feature of our training pipeline was data augmentation. The idea of the
 
 Using both initial and augmented data we fine-tuned [t5-large](https://huggingface.co/t5-large).
 
- ## How to use
-
- ```python
- import re
-
- from transformers import T5ForConditionalGeneration, AutoTokenizer
-
- text_with_error = 'I want to stop smoking during driving bicycle .'
- error_span = '23:29'
-
- # mark the error span with "< <" / "> >" pointer tokens
- off1, off2 = map(int, error_span.split(":"))
- text_with_error_pointed = (
-     text_with_error[:off1]
-     + "< < "
-     + re.sub(r"\s+", " > > < < ", text_with_error[off1:off2].strip())
-     + " > > "
-     + text_with_error[off2:]
- )
- text_with_error_pointed = re.sub(r"\s+", " ", text_with_error_pointed.strip()).lower()
-
- tokenizer = AutoTokenizer.from_pretrained("SkolkovoInstitute/GenChal_2022_nigula")
- model = T5ForConditionalGeneration.from_pretrained("SkolkovoInstitute/GenChal_2022_nigula").cuda()
- model.eval()
-
- def paraphrase(text, model, temperature=1.0, beams=3):
-     texts = [text] if isinstance(text, str) else text
-     inputs = tokenizer(texts, return_tensors='pt', padding=True)['input_ids'].to(model.device)
-     result = model.generate(
-         inputs,
-         do_sample=False,
-         temperature=temperature,
-         repetition_penalty=1.1,
-         max_length=int(inputs.shape[1] * 3),
-         num_beams=beams,
-     )
-     texts = [tokenizer.decode(r, skip_special_tokens=True) for r in result]
-     if isinstance(text, str):
-         return texts[0]
-     return texts
-
- paraphrase([text_with_error_pointed], model)
-
- # expected output: ["a gerund > does not normally follow the preposition > during > >. think of an expression using the conjunction >'while'instead of a preposition >."]
- ```
 
 ## Licensing Information
 
 
 In this task, the model receives a string of text containing an error, together with the exact span of the error, and should return a comment in natural language that explains the nature of the error.
 
+ ## How to use
+
+ ```python
+ !pip install feedback_generation_nigula
+ from feedback_generation_nigula.generator import FeedbackGenerator
+
+ fg = FeedbackGenerator(cuda_index = 0)
+ text = "The smoke flow my face ."
+ span = (10, 17)
+
+ fg.get_feedback([text], [span])
+
+ # expected output: ["When the <verb> <<flow>> is used as an <intransitive verb> to express ''to move in a stream'', a <preposition> needs to be placed to indicate the direction"]
+ ```
+
 ## Model training details
 
 #### Data
 
 
 Using both initial and augmented data we fine-tuned [t5-large](https://huggingface.co/t5-large).
 
 ## Licensing Information