---
license: apache-2.0
datasets:
- poem_sentiment
language:
- en
metrics:
- accuracy
- f1
library_name: transformers
pipeline_tag: text-classification
tags:
- text-classification
- sentiment-analysis
- poem-sentiment-detection
- poem-sentiment
- poem-sentiment-classification
- sentiment-classification
widget:
- text: >-
    Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!
  example_title: "Life" 
- text: It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man.
  example_title: "Walking Around" 
- text: >-
    No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main.
  example_title: "No man is an island" 
- text: >-
    Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow.
  example_title: "Passion" 
---
 ## AiManatee/RoBERTa_poem_sentiment 
This model is a fine-tuned version of the [FacebookAI/roberta-base](https://huggingface.co/FacebookAI/roberta-base) transformer for poem sentiment analysis. It classifies a given poem verse into one of four categories: negative, positive, no impact, or mixed (both positive and negative).

 ### Dataset
RoBERTa_poem_sentiment was trained on the [poem_sentiment](https://huggingface.co/datasets/poem_sentiment) dataset, which consists of poem verses labeled with one of four sentiments: negative, positive, no impact, or mixed. The Validation and Test subsets of the original dataset, however, contain no 'mixed' examples. To evaluate the model on all four trained labels, both subsets were augmented with 32 'mixed' verses taken from different English poems: 16 were added to Validation and 16 to Test. The original Train subset was left unchanged. All added samples were checked for semantic consistency, diversity (cosine similarity), length variation, and novelty (that is, whether they introduce new, relevant vocabulary). The final model was evaluated on both the original dataset and the augmented dataset.
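
The diversity check mentioned above can be illustrated with a small sketch. The card does not say how verses were vectorized, so this example assumes plain word-count vectors; the function names `bag_of_words`, `cosine_similarity`, and `max_similarity_to_existing` are hypothetical and are not taken from the original pipeline.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its word tokens (assumed vectorization)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity_to_existing(candidate, existing):
    """Highest similarity between a candidate verse and any existing verse."""
    candidate_vec = bag_of_words(candidate)
    return max(
        (cosine_similarity(candidate_vec, bag_of_words(verse)) for verse in existing),
        default=0.0,
    )
```

A candidate verse whose highest similarity to the existing verses is close to 1.0 adds little diversity; where to set the cut-off is a judgment call that the card does not specify.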

 #### Labels
```
{0: 'negative', 1: 'positive', 2: 'no_impact', 3: 'mixed'}
```
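
As a minimal sketch, the mapping above can be used to turn a predicted class into a readable name. Whether the hosted model emits label names or generic `LABEL_<id>` strings depends on its configuration, which this card does not state, so the hypothetical helper `to_label_name` accepts both forms.

```python
import re

# Label mapping copied from this model card.
ID2LABEL = {0: "negative", 1: "positive", 2: "no_impact", 3: "mixed"}


def to_label_name(raw):
    """Map an integer id, a 'LABEL_<id>' string, or a label name to its name."""
    if isinstance(raw, int):
        return ID2LABEL[raw]
    match = re.fullmatch(r"LABEL_(\d+)", raw)
    if match:
        return ID2LABEL[int(match.group(1))]
    if raw in ID2LABEL.values():
        return raw
    raise ValueError(f"Unknown label: {raw!r}")
```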

 ### Training Hyperparameters
```
  learning_rate: 2e-5
  weight_decay: 0.01
  batch_size: 16
  num_epochs: 8
  optimizer: AdamW (betas=(0.9, 0.999), eps=1e-08)
  seed: 16
  early_stopper: min_delta=0.001, patience=3
```
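```python
from torch.optim.lr_scheduler import ReduceLROnPlateau

# `optimizer` is the AdamW instance configured with the hyperparameters above.
scheduler = ReduceLROnPlateau(
    optimizer,
    mode="min",       # the monitored value is expected to decrease
    factor=0.5,       # halve the learning rate when progress stalls
    patience=0,       # react after a single epoch without improvement
    threshold=0.001,
    eps=1e-8,
)
```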
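
The `early_stopper` settings listed above (`min_delta=0.001`, `patience=3`) can be read as the following minimal sketch. The card does not say which quantity was monitored or how the stopper was implemented; this example assumes validation loss, and the `EarlyStopper` class is hypothetical.

```python
class EarlyStopper:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, min_delta=0.001, patience=3):
        self.min_delta = min_delta
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            # Improvement of more than min_delta: reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Under this assumption, a sequence of validation losses that never goes three epochs in a row without improving would let training run for all eight epochs.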

 ### Model Performance
##### Validation results on the original dataset (class 3, 'mixed', is not evaluated here)
| Epoch | Training Loss | Validation Loss | Accuracy | F1       |
|-------|---------------|-----------------|----------|----------|
| 1     | 1.365169      | 1.010353        | 0.761905 | 0.771733 |
| 2     | 0.860945      | 0.810045        | 0.723810 | 0.740809 |
| 3     | 0.570005      | 0.637439        | 0.761905 | 0.802184 |
| 4     | 0.355776      | 0.699637        | 0.780952 | 0.797572 |
| 5     | 0.252919      | 0.586395        | 0.847619 | 0.860519 |
| 6     | 0.156633      | 0.610439        | 0.819048 | 0.834072 |
| 7     | 0.084868      | 0.515130        | 0.876190 | 0.884736 |
| 8     | 0.062830      | 0.572643        | 0.885714 | 0.902510 |


 ##### Validation results on the augmented dataset
| Epoch | Training Loss | Validation Loss | Accuracy | F1       |
|-------|---------------|-----------------|----------|----------|
| 1     | 1.365169      | 1.168057        | 0.661157 | 0.628737 |
| 2     | 0.860945      | 0.869521        | 0.694214 | 0.717916 |
| 3     | 0.570005      | 0.637439        | 0.776859 | 0.790842 |
| 4     | 0.355776      | 0.681563        | 0.768595 | 0.776540 |
| 5     | 0.252919      | 0.585692        | 0.834710 | 0.841590 |
| 6     | 0.156633      | 0.542949        | 0.809917 | 0.815361 |
| 7     | 0.092444      | 0.581075        | 0.826446 | 0.830607 |
| 8     | 0.049480      | 0.583749        | 0.884297 | 0.881360 |


 ### How to Use the Model
Here is how to predict the sentiment of a poem verse using this model:

```python
from transformers import pipeline
sentiment_classifier = pipeline(task='text-classification', model='AiManatee/RoBERTa_poem_sentiment')
verse1 = "Rapidly, merrily, Life's sunny hours flit by, Gratefully, cheerily, Enjoy them as they fly!"
verse2 = "It so happens I am sick of my feet and my nails, and my hair and my shadow. It so happens I am sick of being a man."
verse3 = "No man is an island, Entire of itself, Every man is a piece of the continent, A part of the main."
verse4 = "Some have won a wild delight, By daring wilder sorrow; Could I gain thy love to-night, I'd hazard death to-morrow."
print(sentiment_classifier(verse1))
print(sentiment_classifier(verse2))
print(sentiment_classifier(verse3))
print(sentiment_classifier(verse4))
```
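
To inspect the scores for all four labels rather than only the top one, the text-classification pipeline can be called with `top_k=None` on a list of verses. The sketch below wraps that call in a hypothetical helper, `rank_sentiments`, which takes the classifier as an argument so it can be tried without downloading the model.

```python
def rank_sentiments(classifier, verses):
    """Return, for each verse, every (label, score) pair sorted by score.

    `classifier` is expected to behave like a transformers text-classification
    pipeline: called with a list of strings and top_k=None, it returns one
    list of {"label": ..., "score": ...} dicts per input.
    """
    all_scores = classifier(verses, top_k=None)
    ranked = []
    for verse, scores in zip(verses, all_scores):
        ordered = sorted(scores, key=lambda item: item["score"], reverse=True)
        ranked.append((verse, [(item["label"], item["score"]) for item in ordered]))
    return ranked
```

For example, `rank_sentiments(sentiment_classifier, [verse1, verse4])` would list all label scores for the first and fourth verses from the snippet above.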

 ### Evaluation
 ##### Original dataset
```
Loss: 0.5726433790155819
Accuracy: 0.8857142857142857
Precision: 0.9201298701298701
Recall: 0.8857142857142857
F1: 0.9025108225108224
```

 ##### Augmented dataset
```
Loss: 0.5837492472492158
Accuracy: 0.8842975206611571
Precision: 0.8810538160090016
Recall: 0.8842975206611571
F1: 0.8813606847697756
```
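
In both results above, recall equals accuracy, which is what support-weighted averaging over classes produces. The card does not state the averaging mode, so treat this as an inference. Under that assumption, the metrics can be sketched in plain Python as follows; `weighted_metrics` is a hypothetical helper, not the evaluation code used for this model.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label, count in support.items():
        true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        class_precision = true_pos / predicted if predicted else 0.0
        class_recall = true_pos / count
        denom = class_precision + class_recall
        class_f1 = 2 * class_precision * class_recall / denom if denom else 0.0
        # Weight each class by its share of the true labels.
        weight = count / total
        precision += weight * class_precision
        recall += weight * class_recall
        f1 += weight * class_f1

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With this weighting, the per-class recalls sum to the number of correct predictions divided by the number of samples, so weighted recall always matches accuracy.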

 ### Framework Versions
- **Transformers:** 4.35.2
- **PyTorch:** 2.1.0+cu118
- **Datasets:** 2.16.1
- **Tokenizers:** 0.15.1