Nick Doiron
committed
Commit 653ddae
1 Parent(s): 7bc3773
readme fix and code sample

README.md CHANGED
```diff
@@ -1,3 +1,4 @@
+---
 language:
 - en
 license: apache-2.0
@@ -5,6 +6,7 @@ tags:
 - reddit
 datasets:
 - georeactor/reddit_one_ups_seq2seq_2014
+---
 
 # t5-reddit-2014
 
@@ -21,3 +23,15 @@ Training notebook: https://github.com/Georeactor/reddit-one-ups/blob/main/traini
 - Fine-tuned on first 80% of [georeactor/reddit_one_ups_seq2seq_2014](https://huggingface.co/datasets/georeactor/reddit_one_ups_seq2seq_2014) for one epoch, batch size = 2.
 - Loss did not move much during this epoch.
 - Future experiments should use a larger model, larger batch size (could easily have done batch_size = 4 on CoLab), full dataset if we are not worried about eval.
+
+## Inference
+
+```
+from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
+model = AutoModelForSeq2SeqLM.from_pretrained('georeactor/t5-reddit-2014')
+tokenizer = AutoTokenizer.from_pretrained('georeactor/t5-reddit-2014')
+
+input = tokenizer.encode('Looks like a potato bug', return_tensors="pt")
+output = model.generate(input, max_length=256)
+tokenizer.decode(output[0])
+```
```
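The "readme fix" half of this commit is the pair of added `---` lines: the Hugging Face Hub only parses model-card metadata when it sits in a YAML frontmatter block delimited by `---` on both sides. After the commit, the card header reads (the `tags:` line is inferred from the hunk context `@@ -5,6 +6,7 @@ tags:`):

```yaml
---
language:
- en
license: apache-2.0
tags:
- reddit
datasets:
- georeactor/reddit_one_ups_seq2seq_2014
---
```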