---
license: apache-2.0
language: en
datasets:
- sst2
metrics:
- precision
- recall
- f1
tags:
- text-classification
---

# T5-base fine-tuned for Sentiment Analysis 👍👎


[Google's T5](https://ai.googleblog.com/2020/02/exploring-transfer-learning-with-t5.html) base fine-tuned on the [SST-2](https://huggingface.co/datasets/sst2) dataset for the **Sentiment Analysis** downstream task.

## Details of T5

The **T5** model was presented in [Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer](https://arxiv.org/pdf/1910.10683.pdf) by *Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu*.

## Model fine-tuning 🏋️

The model was fine-tuned for 10 epochs with standard hyperparameters.
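
The exact training recipe is not published in this card, so the snippet below is only a minimal sketch of how such a run could look: SST-2 is cast as a text-to-text task using the same `sentiment:` prefix and `p`/`n` targets as the inference example further down. The batch size and learning rate are illustrative assumptions, not the values used for this checkpoint.

```python
import torch
from torch.utils.data import DataLoader
from datasets import load_dataset
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

train_set = load_dataset("sst2", split="train")

def collate(batch):
    # Task prefix matches the one used at inference time below
    enc = tokenizer(
        ["sentiment: " + ex["sentence"] for ex in batch],
        max_length=128, truncation=True, padding=True, return_tensors="pt",
    )
    # Targets are the single-character labels 'p' / 'n'; they all tokenize
    # to the same length here, so no label padding needs to be masked
    enc["labels"] = tokenizer(
        ["p" if ex["label"] == 1 else "n" for ex in batch],
        padding=True, return_tensors="pt",
    ).input_ids
    return enc

loader = DataLoader(train_set, batch_size=16, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

model.train()
for epoch in range(10):  # the card reports 10 epochs
    for batch in loader:
        loss = model(**batch).loss  # seq2seq cross-entropy computed by T5
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```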


## Val set metrics 🧾

|              | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| negative     | 0.95      | 0.95   | 0.95     | 428     |
| positive     | 0.94      | 0.96   | 0.95     | 444     |
| accuracy     |           |        | 0.95     | 872     |
| macro avg    | 0.95      | 0.95   | 0.95     | 872     |
| weighted avg | 0.95      | 0.95   | 0.95     | 872     |
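
For reference, a report in this format can be produced with scikit-learn's `classification_report`, assuming gold labels and model outputs have been collected as `'p'`/`'n'` strings over the SST-2 validation split (the lists below are placeholders):

```python
from sklearn.metrics import classification_report

y_true = ["p", "n", "p", "n"]  # gold labels (illustrative placeholders)
y_pred = ["p", "n", "n", "n"]  # model outputs (illustrative placeholders)

# target_names follow sorted label order: 'n' -> negative, 'p' -> positive
print(classification_report(y_true, y_pred, target_names=["negative", "positive"]))
```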


## Model in Action 🚀

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

tokenizer = T5Tokenizer.from_pretrained("t5-finetune-sst2")
model = T5ForConditionalGeneration.from_pretrained("t5-finetune-sst2")

def get_sentiment(text):
    # Prepend the task prefix used during fine-tuning
    input_ids = tokenizer("sentiment: " + text, max_length=128, truncation=True, return_tensors="pt").input_ids
    preds = model.generate(input_ids)
    # Decode the generated token ids back to label strings
    return tokenizer.batch_decode(preds, skip_special_tokens=True)

get_sentiment("This movie is awesome")

# Labels are 'p' for 'positive' and 'n' for 'negative'
# Output: ['p']
```
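
Because the model emits single-character labels, a small wrapper (a convenience helper for illustration, not part of the checkpoint) can map them to readable strings:

```python
LABELS = {"p": "positive", "n": "negative"}

def get_readable_sentiment(text):
    # get_sentiment returns a list with one decoded label per input
    return LABELS.get(get_sentiment(text)[0], "unknown")

get_readable_sentiment("This movie is awesome")
# Output: 'positive'
```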

> This model card is based on "mrm8488/t5-base-finetuned-imdb-sentiment" by Manuel Romero (@mrm8488).