---
license: mit
datasets:
- deepghs/anime_classification
metrics:
- accuracy
pipeline_tag: image-classification
tags:
- art
---

This model classifies anime images into the following four categories:

* 3D: Images rendered in 3D, including MikuMikuDance, Koikatsu, etc.
* Bangumi: Screenshots from anime videos.
* Comic: Images of manga that contain a significant amount of text or panel sequences.
* Illustration: General anime illustrations.

| Model | FLOPs | Accuracy | Confusion Matrix | Description |
|:-----------------:|:------:|:--------:|:----------------------------------------------------------------------------------------------------------------------:|----------------------------------------------------------------------|
| caformer_s36 | 22.10G | 88.19% | [Confusion Matrix](https://huggingface.co/deepghs/anime_classification/blob/main/caformer_s36/plot_confusion.png) | Model: caformer_s36 from timm |
| caformer_s36_plus | 22.10G | 93.47% | [Confusion Matrix](https://huggingface.co/deepghs/anime_classification/blob/main/caformer_s36_plus/plot_confusion.png) | Model: caformer_s36.sail_in22k_ft_in1k_384 pretrained from timm |
| mobilenetv3 | 0.63G | 88.96% | [Confusion Matrix](https://huggingface.co/deepghs/anime_classification/blob/main/mobilenetv3/plot_confusion.png) | Model: mobilenetv3_large_100 from timm |
| mobilenetv3_plus | 0.63G | 89.92% | [Confusion Matrix](https://huggingface.co/deepghs/anime_classification/blob/main/mobilenetv3_plus/plot_confusion.png) | Model: mobilenetv3_large_100 from timm, using SCELoss as the loss function |
| mobilevitv2_150 | 9.09G | 88.21% | [Confusion Matrix](https://huggingface.co/deepghs/anime_classification/blob/main/mobilevitv2_150/plot_confusion.png) | Model: mobilevitv2_150 from timm |
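
As a rough illustration of how a four-way classifier's output could be turned into one of the categories above, the sketch below applies a softmax to raw scores and picks the highest-scoring label. It is not taken from this repository: the `LABELS` order, the helper names `softmax` and `classify`, and the assumption that the model emits unnormalized logits are all hypothetical, so check the exported model's own metadata before relying on them.

```python
import math

# Hypothetical label order for illustration only; the real order is
# whatever the exported model's metadata defines.
LABELS = ["3d", "bangumi", "comic", "illustration"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(logits, labels=LABELS):
    """Return (best_label, {label: probability}) for one image's scores."""
    if len(logits) != len(labels):
        raise ValueError("expected one score per label")
    scores = dict(zip(labels, softmax(logits)))
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Made-up scores standing in for a model's output on a single image.
    label, scores = classify([0.2, 3.1, -1.0, 1.4])
    print(label)
    for name, prob in scores.items():
        print(f"{name}: {prob:.3f}")
```

If the model already returns probabilities rather than logits, the `softmax` step can be skipped and the arg-max taken directly.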