microsoft-swinv2-small-patch4-window16-256-finetuned-xblockm
This model is a fine-tuned version of microsoft/swinv2-small-patch4-window16-256 on the howdyaendra/xblock-social-screenshots dataset. It achieves the following results on the evaluation set:
- Loss: 0.1252
- Roc Auc: 0.9535
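For readers unfamiliar with the metric, ROC AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below illustrates that definition for a single binary label. It is not the evaluation code used for this model, and how the reported score was averaged across labels is not stated in this card.

```python
def roc_auc(labels, scores):
    """Pairwise ROC AUC for one binary label (illustrative, O(n^2))."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, not taken from the evaluation set.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```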
Model description
This model is trained on several thousand screenshots reported to the XBlock 3rd-party Bluesky labeller service. It is intended to be used to label Bluesky posts that have screenshots from social media sites embedded in them. Please also see aendra-rininsland/xblock.
Intended uses & limitations
Screenshot moderation
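A minimal sketch of how the model might be loaded and applied to one screenshot with the transformers library. Two things here are assumptions rather than facts from this card: that the classifier is multi-label (so a sigmoid and a per-label threshold are applied to the logits), and the 0.5 threshold itself. The helper names `select_labels` and `classify` are hypothetical.

```python
import math

MODEL_ID = "howdyaendra/microsoft-swinv2-small-patch4-window16-256-finetuned-xblockm"


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def select_labels(logits, id2label, threshold=0.5):
    """Return {label: probability} for labels whose sigmoid score passes the threshold.

    Assumes a multi-label head; this is not confirmed by the model card.
    """
    scores = {id2label[i]: sigmoid(v) for i, v in enumerate(logits)}
    return {name: p for name, p in scores.items() if p >= threshold}


def classify(image_path, threshold=0.5):
    """Load the model and score one image (downloads weights on first use)."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForImageClassification.from_pretrained(MODEL_ID)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return select_labels(logits, model.config.id2label, threshold)
```

Calling `classify("screenshot.png")` with a local file would return the labels that pass the threshold. The label names come from the model's own configuration and are not listed in this card.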
Training and evaluation data
The dataset contains 1618 images, 20% of which were held out for evaluation.
Training procedure
See notebook.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 8
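The arithmetic behind these settings can be sketched as follows: the total batch size is the per-device batch size times the gradient accumulation steps, and a linear schedule with a 0.1 warmup ratio ramps the learning rate up for the first 10% of optimizer steps before decaying it to zero. The total of 160 steps is taken from the last row of the training results table. The schedule formula is the usual linear-with-warmup shape and is an assumption about what the training run did, not a copy of its code.

```python
import math

LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 4
WARMUP_RATIO = 0.1
TOTAL_STEPS = 160  # final optimizer step in the training results table

# 16 images per micro-batch, accumulated over 4 micro-batches.
TOTAL_TRAIN_BATCH_SIZE = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Warmup lasts for the first 10% of optimizer steps.
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step):
    """Learning rate at an optimizer step under linear warmup then linear decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    print(TOTAL_TRAIN_BATCH_SIZE, WARMUP_STEPS)
    for step in (0, 8, 16, 88, 160):
        print(step, lr_at(step))
```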
Training results
Training Loss | Epoch | Step | Validation Loss | Roc Auc |
---|---|---|---|---|
0.4357 | 0.9877 | 20 | 0.2544 | 0.7784 |
0.2027 | 1.9753 | 40 | 0.2016 | 0.8431 |
0.1743 | 2.9630 | 60 | 0.1701 | 0.8912 |
0.1625 | 4.0 | 81 | 0.1677 | 0.9083 |
0.1321 | 4.9877 | 101 | 0.1447 | 0.9246 |
0.1155 | 5.9753 | 121 | 0.1418 | 0.9311 |
0.0959 | 6.9630 | 141 | 0.1381 | 0.9460 |
0.0788 | 7.9012 | 160 | 0.1252 | 0.9535 |
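The fractional epoch values in the table follow from gradient accumulation. The row where step 81 lands exactly on epoch 4.0 implies 81 micro-batches of 16 images per epoch, since 81 optimizer steps times 4 accumulated micro-batches spread over 4 epochs gives 81 per epoch. That figure is inferred from the table, not stated in the card, and it is consistent with a training split of roughly 80% of 1618 images. Under that assumption, the epoch column can be reproduced as:

```python
GRAD_ACCUM_STEPS = 4
MICRO_BATCHES_PER_EPOCH = 81  # inferred from step 81 == epoch 4.0, an assumption


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return step * GRAD_ACCUM_STEPS / MICRO_BATCHES_PER_EPOCH


if __name__ == "__main__":
    for step in (20, 40, 60, 81, 101, 121, 141, 160):
        print(step, round(epoch_at(step), 4))
```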
Framework versions
- Transformers 4.44.1
- Pytorch 2.2.2
- Datasets 3.0.1
- Tokenizers 0.19.1
Model tree for howdyaendra/microsoft-swinv2-small-patch4-window16-256-finetuned-xblockm
Base model
microsoft/swinv2-small-patch4-window16-256