---
tags:
- spacy
- text-classification
language:
- en
license: mit
model-index:
- name: en_textcat_goemotions
results: []
---
# 🪐 spaCy Project: Categorization of emotions in Reddit posts (Text Classification)

This project uses spaCy to train a text classifier on the [GoEmotions dataset](https://github.com/google-research/google-research/tree/master/goemotions).

| Feature | Description |
| --- | --- |
| **Name** | `en_textcat_goemotions` |
| **Version** | `0.0.1` |
| **spaCy** | `>=3.1.1,<3.2.0` |
| **Default Pipeline** | `transformer`, `textcat_multilabel` |
| **Components** | `transformer`, `textcat_multilabel` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | [GoEmotions dataset](https://github.com/google-research/google-research/tree/master/goemotions) |
| **License** | `MIT` |
| **Author** | [Explosion](https://explosion.ai) |

> The dataset this model is trained on has known flaws, described in the [GoEmotions disclaimer](https://github.com/google-research/google-research/tree/master/goemotions#disclaimer), as well as label errors resulting from [annotator disagreement](https://www.youtube.com/watch?v=khZ5-AN-n2Y). Anyone using this model should be aware of these limitations.
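
The spaCy constraint in the table above (`>=3.1.1,<3.2.0`) can be checked before installing the pipeline. The following is a minimal sketch using only the standard library; the helper names `parse_version` and `is_compatible` are hypothetical, and the parser assumes plain `X.Y.Z` version strings (pre-release suffixes such as `3.1.1a1` are not handled).

```python
def parse_version(version: str) -> tuple:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_compatible(installed: str, minimum: str = "3.1.1", upper: str = "3.2.0") -> bool:
    """Return True if minimum <= installed < upper (the range from the table)."""
    return parse_version(minimum) <= parse_version(installed) < parse_version(upper)


# Illustrative version strings, not taken from the model card.
print(is_compatible("3.1.4"))
print(is_compatible("3.2.0"))
```

In a real environment the installed version would typically come from `spacy.__version__`. For anything beyond simple `X.Y.Z` strings, a dedicated version-specifier library is likely the safer choice.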
### Label Scheme
<details>
<summary>View label scheme (28 labels for 1 component)</summary>
| Component | Labels |
| --- | --- |
| **`textcat_multilabel`** | `admiration`, `amusement`, `anger`, `annoyance`, `approval`, `caring`, `confusion`, `curiosity`, `desire`, `disappointment`, `disapproval`, `disgust`, `embarrassment`, `excitement`, `fear`, `gratitude`, `grief`, `joy`, `love`, `nervousness`, `optimism`, `pride`, `realization`, `relief`, `remorse`, `sadness`, `surprise`, `neutral` |
</details>
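
Because `textcat_multilabel` scores every label independently, a single text can receive several of the 28 labels at once. In spaCy, the scores are exposed on `doc.cats` as a dictionary mapping label to score. The sketch below shows one way to turn such a dictionary into a list of predicted emotions; the helper `select_labels`, the `0.5` threshold, and the example scores are assumptions for illustration, not output from this model.

```python
def select_labels(cats: dict, threshold: float = 0.5) -> list:
    """Keep labels whose score meets the threshold, highest score first."""
    picked = [(label, score) for label, score in cats.items() if score >= threshold]
    return sorted(picked, key=lambda pair: pair[1], reverse=True)


# Made-up scores in the shape of doc.cats; a real run would cover all 28 labels.
example_cats = {"gratitude": 0.91, "joy": 0.62, "neutral": 0.07, "anger": 0.01}
print(select_labels(example_cats))
```

With the pipeline package installed, the dictionary would presumably come from `spacy.load("en_textcat_goemotions")(text).cats`. Given the precision and recall figures reported for this model, the threshold may be worth tuning for a particular use case.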
### Accuracy
| Type | Score |
| --- | --- |
| `CATS_SCORE` | 90.22 |
| `CATS_MICRO_P` | 66.67 |
| `CATS_MICRO_R` | 47.81 |
| `CATS_MICRO_F` | 55.68 |
| `CATS_MACRO_P` | 55.00 |
| `CATS_MACRO_R` | 41.93 |
| `CATS_MACRO_F` | 46.29 |
| `CATS_MACRO_AUC` | 90.22 |
| `CATS_MACRO_AUC_PER_TYPE` | 0.00 |
| `TRANSFORMER_LOSS` | 83.51 |
| `TEXTCAT_MULTILABEL_LOSS` | 4549.84 |
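
As a sanity check on the table above, the micro-averaged F-score should be the harmonic mean of the micro precision and micro recall. The sketch below recomputes it from the rounded table values; the helper name `f_score` is hypothetical, and a small discrepancy is expected because the inputs are already rounded to two decimals.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (F1); 0.0 if both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CATS_MICRO_P and CATS_MICRO_R from the accuracy table.
micro_f = f_score(66.67, 47.81)
reported_micro_f = 55.68  # CATS_MICRO_F from the accuracy table

print(f"recomputed: {micro_f:.2f}, reported: {reported_micro_f:.2f}")
```

The same check is not expected to reproduce `CATS_MACRO_F` from `CATS_MACRO_P` and `CATS_MACRO_R`: to the best of my understanding, spaCy computes the macro F-score as the mean of the per-label F-scores rather than as the harmonic mean of the macro-averaged precision and recall.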