ai-forever committed • Commit 0c8f4e6 • Parent(s): f1f37bb

Update README.md

README.md CHANGED
@@ -155,26 +155,36 @@ We compare our solution with both open automatic spell checkers and the ChatGPT
| Model | Pr. (spell) | Rec. (spell) | F1 (spell) | Pr. (punc) | Rec. (punc) | F1 (punc) | Pr. (case) | Rec. (case) | F1 (case) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| sage-fredt5-large | x | x | x | x | x | x | x | x | x |
| sage-fredt5-large (ft) | 67.5 | 53.2 | 59.5 | 48.5 | 38.0 | 42.6 | 37.3 | 50.0 | 42.7 |
| sage-ai-service | 70.8 | 56.3 | 62.7 | 48.9 | 35.8 | 41.4 | 32.9 | 45.3 | 38.1 |
| gpt-3.5-turbo | 23.7 | 38.7 | 29.4 | 37.6 | 23.3 | 28.7 | 19.6 | 35.9 | 25.3 |
| gpt-4 | 27.0 | 52.8 | 35.7 | 45.9 | 32.6 | 38.2 | 25.7 | 36.8 | 30.2 |
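Each F1 column in the table above is the harmonic mean of the matching precision and recall columns. A quick sanity check (helper name `f1` is illustrative, not part of the model card):

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, as used in the table."""
    return 2 * precision * recall / (precision + recall)

# sage-ai-service, spelling: Pr. 70.8, Rec. 56.3 -> F1 62.7
print(round(f1(70.8, 56.3), 1))  # 62.7

# sage-fredt5-large (ft), spelling: Pr. 67.5, Rec. 53.2 -> F1 59.5
print(round(f1(67.5, 53.2), 1))  # 59.5
```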

## How to use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("ai-forever/sage-fredt5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("ai-forever/sage-fredt5-large")

model.to("cuda:0")

sentence = "И не чсно прохожим в этот день непогожйи почему я веселый такйо"
text = "<LM>" + sentence
with torch.inference_mode():
    encodings = tokenizer(text, max_length=None, padding="longest", truncation=False, return_tensors="pt")
    for k, v in encodings.items():
        encodings[k] = v.to("cuda:0")
    # generate() expects an int max_length; budget 1.5x the input length
    res = model.generate(
        **encodings,
        use_cache=True,
        max_length=int(encodings["input_ids"].size(1) * 1.5),
    )
res = res.cpu().tolist()
res = tokenizer.batch_decode(res, skip_special_tokens=True)
print(res)

# ["И не ясно прохожим в этот день непогожий, почему я веселый такой."]
```
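Two conventions in the snippet above are easy to miss: the model expects the `<LM>` task prefix before the input text, and the generation length is budgeted as 1.5× the tokenized input length, cast to `int`. A minimal sketch of just those two steps, without loading the model (`make_input` and `length_budget` are illustrative helper names, not part of the model card):

```python
def make_input(sentence: str) -> str:
    # sage-fredt5-large expects the "<LM>" task prefix before the text.
    return "<LM>" + sentence

def length_budget(num_input_tokens: int, factor: float = 1.5) -> int:
    # generate() needs an int max_length; the card budgets 1.5x the input length.
    return int(num_input_tokens * factor)

print(make_input("пример"))  # <LM>пример
print(length_budget(20))     # 30
```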

## Resources