File size: 2,099 Bytes

---
tags:
- spacy
- text-classification
language:
- en
license: mit
model-index:
- name: en_textcat_goemotions
  results: []
---
# 🪐 spaCy Project: Categorization of emotions in Reddit posts (Text Classification) This project uses spaCy to train a text classifier on the [GoEmotions dataset](https://github.com/google-research/google-research/tree/master/goemotions)

| Feature | Description |
| --- | --- |
| **Name** | `en_textcat_goemotions` |
| **Version** | `0.0.1` |
| **spaCy** | `>=3.1.1,<3.2.0` |
| **Default Pipeline** | `transformer`, `textcat_multilabel` |
| **Components** | `transformer`, `textcat_multilabel` |
| **Vectors** | 0 keys, 0 unique vectors (0 dimensions) |
| **Sources** | [GoEmotions dataset](https://github.com/google-research/google-research/tree/master/goemotions) |
| **License** | `MIT` |
| **Author** | [Explosion](explosion.ai) |


> The dataset that this model is trained on has known flaws described [here](https://github.com/google-research/google-research/tree/master/goemotions#disclaimer) as well as label errors resulting from [annotator disagreement](https://www.youtube.com/watch?v=khZ5-AN-n2Y). Anyone using this model should be aware of these limitations of the dataset.

### Label Scheme

<details>

<summary>View label scheme (28 labels for 1 components)</summary>

| Component | Labels |
| --- | --- |
| **`textcat_multilabel`** | `admiration`, `amusement`, `anger`, `annoyance`, `approval`, `caring`, `confusion`, `curiosity`, `desire`, `disappointment`, `disapproval`, `disgust`, `embarrassment`, `excitement`, `fear`, `gratitude`, `grief`, `joy`, `love`, `nervousness`, `optimism`, `pride`, `realization`, `relief`, `remorse`, `sadness`, `surprise`, `neutral` |

</details>

### Accuracy

| Type | Score |
| --- | --- |
| `CATS_SCORE` | 90.22 |
| `CATS_MICRO_P` | 66.67 |
| `CATS_MICRO_R` | 47.81 |
| `CATS_MICRO_F` | 55.68 |
| `CATS_MACRO_P` | 55.00 |
| `CATS_MACRO_R` | 41.93 |
| `CATS_MACRO_F` | 46.29 |
| `CATS_MACRO_AUC` | 90.22 |
| `CATS_MACRO_AUC_PER_TYPE` | 0.00 |
| `TRANSFORMER_LOSS` | 83.51 |
| `TEXTCAT_MULTILABEL_LOSS` | 4549.84 |