---
license: apache-2.0
tags:
- text-classification
- depression
- reddit
- generated_from_trainer
datasets:
- mrjunos/depression-reddit-cleaned
metrics:
- accuracy
widget:
- text:
  - >-
    i just found out my boyfriend is depressed i really want to be there for him
    but i feel like i ve only been saying the wrong thing how can i be there for
    him help him and see him get better i m worried it will continue to the
    point it will consume him i can already see his personality changing and i m
    scared for the future what thing can i say or do to comfort or help
  example_title: depression
- text:
  - >-
    i m getting more and more people asking where they can buy the ambients
    album simple answer is quot not yet quot it ll be on itunes eventually
  example_title: not_depression
model-index:
- name: depression-reddit-distilroberta-base
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: mrjunos/depression-reddit-cleaned
      type: depression-reddit-cleaned
      config: default
      split: train
      args: default
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.9715578539107951
language:
- en
pipeline_tag: text-classification
---


## Example Pipeline

```python
from transformers import pipeline
predict_task = pipeline(model="mrjunos/depression-reddit-distilroberta-base", task="text-classification")
predict_task("Stop listing your issues here, use forum instead or open ticket.")
```
```
[{'label': 'not_depression', 'score': 0.9813856482505798}]
```
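
The pipeline returns a list with one dictionary per input, each holding a `label` and a `score`. A minimal sketch of turning that output into a yes/no flag might look like the following. The helper name `is_flagged`, the 0.9 threshold, and the positive label string `depression` are assumptions for illustration; only `not_depression` appears in the output above.

```python
def is_flagged(prediction, positive_label="depression", threshold=0.9):
    """Return True when the prediction is the positive label with a high score.

    `positive_label` and `threshold` are illustrative assumptions, not values
    documented by this model card.
    """
    return prediction["label"] == positive_label and prediction["score"] >= threshold


# Output copied from the example above.
result = [{"label": "not_depression", "score": 0.9813856482505798}]

# One boolean per input text.
flags = [is_flagged(p) for p in result]
print(flags)
```

A lower threshold flags more texts at the cost of more false positives; the right value depends on the use case and would need to be checked against labeled data.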

Disclaimer: This model classifies text as related to depression or not. I am not an expert or a mental health professional,
and the model is not intended to diagnose or offer medical advice. Its output should not replace consultation with a qualified professional,
and its results may be inaccurate. Use this model at your own risk and seek professional advice if needed.

This model is a fine-tuned version of [distilroberta-base](https://huggingface.co/distilroberta-base) on the [mrjunos/depression-reddit-cleaned dataset](https://huggingface.co/datasets/mrjunos/depression-reddit-cleaned).
It achieves the following results on the evaluation set (these values match the step-500 row of the training results table below):
- Loss: 0.0821
- Accuracy: 0.9716

## Model description

This is a DistilRoBERTa transformer model fine-tuned on a dataset of Reddit posts related to depression.
It classifies a post as either depression or not depression.

## Intended uses & limitations

This model is intended for research purposes and is not yet ready for production use.
It was trained on English-language posts only, so it may not be accurate for other languages.

## Training and evaluation data

The model was trained on the mrjunos/depression-reddit-cleaned dataset, which contains approximately 7,000 labeled instances.
The data was split into training (80%) and test (20%) sets with a fixed seed:
```python
ds = ds['train'].train_test_split(test_size=0.2, seed=42)
```
The dataset consists of two main features: 'text' and 'label'. The 'text' feature contains the text data from Reddit posts related to depression, while the 'label' feature indicates whether a post is classified as depression or not.

## Training procedure

The steps I followed to train this model are in this notebook:
https://github.com/mrjunos/machine_learning/blob/main/NLP-fine_tunning-hugging_face_model.ipynb

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
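
To make the `linear` scheduler setting concrete, here is a small sketch of a learning rate decaying linearly from 5e-05 to zero. It assumes no warmup steps and a hypothetical total of 2,400 optimizer steps; neither value is stated in this card, and `linear_lr` is a function written for illustration, not part of Transformers.

```python
def linear_lr(step, total_steps, base_lr=5e-05):
    """Learning rate after `step` optimizer steps under linear decay to zero.

    Assumes zero warmup steps (an assumption, not stated in the card).
    """
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / total_steps)


TOTAL_STEPS = 2400  # hypothetical; the card does not list the total step count

# Learning rate at the start, midpoint, and end of training.
for step in (0, 1200, 2400):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate is halved at the midpoint and reaches zero on the final step.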

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.1711        | 0.65  | 500  | 0.0821          | 0.9716   |
| 0.1022        | 1.29  | 1000 | 0.1148          | 0.9709   |
| 0.0595        | 1.94  | 1500 | 0.1178          | 0.9787   |
| 0.0348        | 2.59  | 2000 | 0.0951          | 0.9851   |
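
The rows above can also be compared programmatically. The sketch below copies the table into plain Python and picks the checkpoint with the lowest validation loss and the one with the highest accuracy. Which criterion selected the checkpoint behind the headline metrics is not documented in this card, so both are shown.

```python
# (training_loss, epoch, step, validation_loss, accuracy), copied from the table above.
rows = [
    (0.1711, 0.65, 500, 0.0821, 0.9716),
    (0.1022, 1.29, 1000, 0.1148, 0.9709),
    (0.0595, 1.94, 1500, 0.1178, 0.9787),
    (0.0348, 2.59, 2000, 0.0951, 0.9851),
]

# Index 3 is validation loss, index 4 is accuracy, index 2 is the step.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_accuracy = max(rows, key=lambda r: r[4])

print("lowest validation loss at step", best_by_loss[2])
print("highest accuracy at step", best_by_accuracy[2])
```

The two criteria point to different checkpoints here, which is worth keeping in mind when comparing the headline numbers with the last row of the table.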


### Framework versions

- Transformers 4.30.2
- PyTorch 2.0.1+cu118
- Datasets 2.13.0
- Tokenizers 0.13.3