metadata
license: apache-2.0
base_model: distilbert/distilbert-base-uncased
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: fifi_classification
    results: []
datasets:
  - mjbeattie/finditfixit

fifi_classification

First load: April 13, 2024

University of Oklahoma

The city of Seattle uses an app called FindIt-FixIt to gather service requests from residents. The requests are routed to the responsible agency for resolution. In 2023, we obtained the detailed request data for 2018-2023 in an effort to understand how COVID affected city services. This data includes, among other things, detailed text from residents. It also includes the service request type as chosen by the resident. The text details and their corresponding categories are included in the dataset mjbeattie/finditfixit.

This dataset was used to fine-tune distilbert/distilbert-base-uncased to classify text into one of the application's 15 service request types. This model can be used to classify unseen texts.
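A minimal inference sketch follows. It assumes the model is published on the Hugging Face Hub as `mjbeattie/fifi_classification`; that repository id is inferred from the dataset owner and the model name and is not confirmed by this card. The helper names and the sample request text are illustrative only.

```python
from typing import Callable, Dict, List

# Assumption: Hub repository id inferred from the dataset owner and model name.
MODEL_ID = "mjbeattie/fifi_classification"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so classify_requests can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_requests(texts: List[str], classifier: Callable) -> List[Dict]:
    """Return the top predicted label and its score for each text."""
    predictions = classifier(list(texts))
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]


def demo() -> None:
    """Classify one made-up request; call demo() to run it (needs network access)."""
    classifier = load_classifier()
    sample = ["There is a large pothole in the bike lane on my street."]
    for row in classify_requests(sample, classifier):
        print(row)
```

The label strings returned depend on the `id2label` mapping stored with the model, so they may be generic (`LABEL_0`, ...) rather than the service request type names.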

The model achieves the following results on the evaluation set after the final (fourth) training epoch:

  • Loss: 0.6323
  • Accuracy: 0.7987

Model description

Classifies text into the 15 Seattle service request types.

Intended uses & limitations

The model is intended for reclassifying service requests that were made before the SPD-Unauthorized Encampment type was introduced.
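The reclassification use case could be sketched as below. This is an illustration under stated assumptions, not the procedure actually used: the record keys (`created`, `text`, `type`), the cutoff date, the score threshold, and the exact label string are all hypothetical and should be checked against the real data and the model's label mapping. The `classifier` argument is any callable that takes a list of texts and returns one `{"label", "score"}` dict per text, which is how a Transformers text-classification pipeline behaves.

```python
from datetime import date
from typing import Callable, Dict, List

# Assumption: this string matches the label the model emits for the encampment type.
ENCAMPMENT_LABEL = "SPD-Unauthorized Encampment"


def reclassify_before(
    records: List[Dict],
    classifier: Callable,
    cutoff: date,
    min_score: float = 0.5,  # hypothetical confidence threshold
) -> List[Dict]:
    """Re-label records created before `cutoff` that the model assigns to the encampment type."""
    updated = [dict(r) for r in records]  # copy so the input is left untouched
    old = [r for r in updated if r["created"] < cutoff]
    if old:
        predictions = classifier([r["text"] for r in old])
        for record, pred in zip(old, predictions):
            if pred["label"] == ENCAMPMENT_LABEL and pred["score"] >= min_score:
                record["type"] = ENCAMPMENT_LABEL
    return updated
```

Records on or after the cutoff keep the type the resident chose, since the new type was already available to them.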

Training and evaluation data

The model was trained and evaluated on the mjbeattie/finditfixit dataset.
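To inspect the data, the dataset can presumably be loaded with the `datasets` library. The sketch below assumes the dataset is publicly accessible under the id given above. The split and column names are not documented in this card, so the code prints them instead of assuming them.

```python
from typing import Dict


def load_finditfixit():
    """Load every available split of the dataset from the Hugging Face Hub."""
    # Imported lazily; requires the `datasets` package and network access.
    from datasets import load_dataset

    return load_dataset("mjbeattie/finditfixit")


def split_sizes(dataset_dict) -> Dict[str, int]:
    """Map each split name to its number of rows."""
    return {name: len(split) for name, split in dataset_dict.items()}


def demo() -> None:
    """Print split sizes and the dataset summary, which lists the column names."""
    dataset = load_finditfixit()
    print(split_sizes(dataset))
    print(dataset)
```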

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
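The hyperparameters above map onto `transformers.TrainingArguments` roughly as follows. This is a sketch, not the original training script: it assumes a single training device (so the listed batch sizes are per-device values), and `output_dir` is a hypothetical path. The card lists the optimizer as Adam; the `Trainer` default in this Transformers version is, to the best of our knowledge, AdamW, which uses the same beta and epsilon settings.

```python
# Values taken from the hyperparameter list; argument names follow TrainingArguments.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # assumes one device
    "per_device_eval_batch_size": 16,   # assumes one device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
}


def build_training_arguments(output_dir: str = "fifi_classification"):
    """Create TrainingArguments from the listed values (output_dir is hypothetical)."""
    # Imported lazily; requires transformers and a PyTorch installation.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```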

Training results

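| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|---------------|-------|-------|-----------------|----------|
| 0.6326        | 1.0   | 2975  | 0.6031          | 0.7961   |
| 0.4962        | 2.0   | 5950  | 0.5833          | 0.8029   |
| 0.4335        | 3.0   | 8925  | 0.6113          | 0.8014   |
| 0.3552        | 4.0   | 11900 | 0.6323          | 0.7987   |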
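The training results above can be read programmatically. The short sketch below picks the epoch with the lowest validation loss and bounds the training set size from the step count. The size estimate assumes one device, no gradient accumulation, and a possibly partial last batch; none of these is stated in the card.

```python
# (training_loss, epoch, step, validation_loss, accuracy), copied from the results table.
RESULTS = [
    (0.6326, 1.0, 2975, 0.6031, 0.7961),
    (0.4962, 2.0, 5950, 0.5833, 0.8029),
    (0.4335, 3.0, 8925, 0.6113, 0.8014),
    (0.3552, 4.0, 11900, 0.6323, 0.7987),
]
TRAIN_BATCH_SIZE = 16


def best_epoch(results):
    """Return the row with the lowest validation loss."""
    return min(results, key=lambda row: row[3])


def training_set_bounds(steps_per_epoch: int, batch_size: int):
    """Smallest and largest example counts that need exactly this many steps per epoch."""
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


best = best_epoch(RESULTS)
steps_per_epoch = RESULTS[0][2]  # step count at the end of epoch 1
low, high = training_set_bounds(steps_per_epoch, TRAIN_BATCH_SIZE)
print(f"best epoch: {best[1]} (val loss {best[3]}, accuracy {best[4]})")
print(f"training examples: between {low} and {high}")
```

Validation loss is lowest and accuracy highest at epoch 2, while training loss keeps falling through epoch 4. That pattern is consistent with mild overfitting in the later epochs, so the epoch-2 checkpoint may have been a slightly better choice than the final one.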

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2