fifi_classification
First load: April 13, 2024
University of Oklahoma
The city of Seattle uses an app called FindIt-FixIt to gather service requests from residents. The requests are routed to the responsible agency for resolution. In 2023, we obtained the detailed request data for 2018-2023 in an effort to understand how COVID affected city services. This data includes, among other things, the free-text details entered by residents and the service request type each resident chose. The text details and their corresponding categories are included in the dataset mjbeattie/finditfixit.
This dataset was used to fine-tune distilbert/distilbert-base-uncased to classify text into one of the application's 15 service request types. This model can be used to classify unseen texts.
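As an illustration of classifying unseen texts, the following is a minimal sketch using the Transformers `pipeline` API. The repository id `mjbeattie/fifi_classification` is an assumption inferred from the model name and the dataset owner, and `flag_low_confidence` with its 0.5 threshold is a hypothetical helper that is not part of the original work.

```python
from typing import Dict, List

# Assumption: repository id inferred from the model name and dataset owner;
# replace it with the actual Hub id if it differs.
MODEL_ID = "mjbeattie/fifi_classification"


def classify(texts: List[str], model_id: str = MODEL_ID) -> List[Dict]:
    """Return one {'label': ..., 'score': ...} dict per input text."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True cuts inputs longer than the model's maximum length.
    return classifier(texts, truncation=True)


def flag_low_confidence(predictions: List[Dict], threshold: float = 0.5) -> List[bool]:
    """Hypothetical helper: mark predictions whose top score is below threshold."""
    return [p["score"] < threshold for p in predictions]
```

Calling `classify(["There is a large pothole on 5th Ave near Pine St."])` downloads the model on first use and returns the predicted service request type with its score. Flagged predictions could be routed to manual review, since overall accuracy is roughly 80%.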
The model achieves the following results on the evaluation set:
- Loss: 0.6323
- Accuracy: 0.7987
Model description
Classifies text into the 15 Seattle service request types.
Intended uses & limitations
The model is intended for reclassifying service requests made before the SPD-Unauthorized Encampment type was introduced.
Training and evaluation data
The model was trained and evaluated on the mjbeattie/finditfixit dataset.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 4
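To make these settings concrete, the sketch below derives the total number of optimizer steps and an upper bound on the training set size, then evaluates the linear learning-rate schedule at a given step. It assumes a single device, no gradient accumulation, and zero warmup steps; none of these is stated in the card. The steps-per-epoch figure is taken from the training results table.

```python
LEARNING_RATE = 2e-5
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 4
STEPS_PER_EPOCH = 2975  # from the training results table

# Total optimizer steps over the whole run.
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS

# Upper bound on training examples (the last batch of an epoch may be partial).
# Assumes one device and no gradient accumulation.
max_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total: int = total_steps, warmup: int = 0) -> float:
    """Linear warmup to base_lr, then linear decay to zero at `total` steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(total_steps, max_train_examples, linear_lr(STEPS_PER_EPOCH * 2))
```

Under these assumptions the learning rate has decayed to half its initial value by the end of epoch 2 and reaches zero at the final step.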
Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 0.6326 | 1.0 | 2975 | 0.6031 | 0.7961 |
| 0.4962 | 2.0 | 5950 | 0.5833 | 0.8029 |
| 0.4335 | 3.0 | 8925 | 0.6113 | 0.8014 |
| 0.3552 | 4.0 | 11900 | 0.6323 | 0.7987 |
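The table can be read programmatically to see which epoch performed best on the evaluation set. The sketch below only restates the numbers above; the `EpochResult` type is a name introduced for illustration.

```python
from collections import namedtuple

# Hypothetical record type holding one row of the results table.
EpochResult = namedtuple(
    "EpochResult", ["epoch", "step", "train_loss", "val_loss", "accuracy"]
)

results = [
    EpochResult(1, 2975, 0.6326, 0.6031, 0.7961),
    EpochResult(2, 5950, 0.4962, 0.5833, 0.8029),
    EpochResult(3, 8925, 0.4335, 0.6113, 0.8014),
    EpochResult(4, 11900, 0.3552, 0.6323, 0.7987),
]

# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_by_loss = min(results, key=lambda r: r.val_loss)
best_by_accuracy = max(results, key=lambda r: r.accuracy)

print(best_by_loss.epoch, best_by_accuracy.epoch)
```

Both criteria select epoch 2. The reported evaluation figures (loss 0.6323, accuracy 0.7987) correspond to epoch 4, where validation loss has risen while training loss continues to fall, a pattern consistent with mild overfitting.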
Framework versions
- Transformers 4.38.2
- Pytorch 2.2.1+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2