---
license: mit
datasets:
- deepghs/monochrome_danbooru
metrics:
- accuracy
pipeline_tag: image-classification
tags:
- art
---

These models determine whether an anime image is monochrome. All of them were trained at an image size of 384.

| Model | FLOPs | Accuracy | Confusion Matrix | Description |
|:--------------------------------:|:------:|:--------:|:----------------:|-------------|
| caformer_s36 | 22.10G | 95.63% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/caformer_s36/plot_confusion.png) | Model: caformer_s36 from timm |
| caformer_s36_safe2 | 22.10G | 95.52% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/caformer_s36_safe2/plot_confusion.png) | Model: caformer_s36 from timm, with higher precision and lower recall than caformer_s36 |
| caformer_s36_plus | 22.10G | 97.31% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/caformer_s36_plus/plot_confusion.png) | Model: caformer_s36.sail_in22k_ft_in1k_384, pretrained, from timm |
| caformer_s36_plus_safe2 | 22.10G | 97.09% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/caformer_s36_plus_safe2/plot_confusion.png) | Model: caformer_s36.sail_in22k_ft_in1k_384, pretrained, from timm, with higher precision and lower recall than caformer_s36_plus |
| mobilenetv3_large_100 | 0.63G | 95.40% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/mobilenetv3_large_100/plot_confusion.png) | Model: mobilenetv3_large_100 from timm |
| mobilenetv3_large_100_dist | 0.63G | 96.30% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/mobilenetv3_large_100_dist/plot_confusion.png) | Distilled from caformer_s36_plus, using mobilenetv3_large_100 |
| mobilenetv3_large_100_safe2 | 0.63G | 94.62% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/mobilenetv3_large_100_safe2/plot_confusion.png) | Model: mobilenetv3_large_100 from timm, with higher precision and lower recall than mobilenetv3_large_100 |
| mobilenetv3_large_100_dist_safe2 | 0.63G | 95.85% | [Confusion Matrix](https://huggingface.co/deepghs/monochrome_detect/blob/main/mobilenetv3_large_100_dist_safe2/plot_confusion.png) | Distilled from caformer_s36_plus_safe2, using mobilenetv3_large_100 |
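
The `safe2` variants are described as trading recall for precision. The sketch below shows how accuracy, precision, and recall for the monochrome class can be derived from the four cells of a binary confusion matrix. The helper name `binary_metrics` and all counts are hypothetical and chosen only for illustration; they are not read from the linked confusion matrix plots.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision and recall for the positive ("monochrome") class.

    tp: monochrome images predicted as monochrome
    fp: non-monochrome images predicted as monochrome
    fn: monochrome images predicted as non-monochrome
    tn: non-monochrome images predicted as non-monochrome
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical counts (NOT the published results): a baseline model and a
# stricter model that flags fewer images as monochrome.
baseline = binary_metrics(tp=90, fp=10, fn=10, tn=90)
stricter = binary_metrics(tp=85, fp=4, fn=15, tn=96)

for name, metrics in (("baseline", baseline), ("stricter", stricter)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

With these invented counts the stricter model produces fewer false positives and more false negatives, so its precision rises while its recall falls, and accuracy barely moves. That is the kind of trade-off to keep in mind when choosing between a base model and its `safe2` counterpart.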