|
--- |
|
license: mit |
|
datasets: |
|
- deepghs/chafen_arknights |
|
- deepghs/monochrome_danbooru |
|
metrics: |
|
- accuracy |
|
--- |
|
|
|
# imgutils-models |
|
|
|
This repository includes all the models in [deepghs/imgutils](https://github.com/deepghs/imgutils). |
|
|
|
## LPIPS |
|
|
|
This model is used for clustering anime images (called `差分`, "differential images", in Chinese), based on [richzhang/PerceptualSimilarity](https://github.com/richzhang/PerceptualSimilarity) and trained on the dataset [deepghs/chafen_arknights(private)](https://huggingface.co/datasets/deepghs/chafen_arknights).
|
|
|
When the threshold is `0.45`, the [adjusted rand score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html) can reach `0.995`.
|
|
|
File list:

* `lpips_diff.onnx`, computes the perceptual difference between two feature maps.

* `lpips_feature.onnx`, extracts features from an image.
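Given the pairwise differences produced by the two ONNX models, clustering reduces to grouping images whose difference falls below the `0.45` threshold. The sketch below is illustrative only (the `diffs` matrix is assumed to come from running `lpips_feature.onnx` on each image and `lpips_diff.onnx` on each feature pair); it uses a simple union-find rather than any specific clustering from the library:

```python
# Illustrative sketch: threshold clustering over pairwise LPIPS differences.
# `diffs` is a symmetric n x n matrix; entries below the threshold mean
# "same cluster". The matrix itself would come from the ONNX models above.

def cluster_by_threshold(diffs, threshold=0.45):
    n = len(diffs)
    parent = list(range(n))

    def find(x):
        # find the root of x, with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(n):
        for j in range(i + 1, n):
            if diffs[i][j] < threshold:  # similar enough -> merge clusters
                parent[find(i)] = find(j)

    # map each root to a consecutive cluster id
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]
```

For example, with three images where only the first two are near-duplicates, the first two receive the same label and the third its own.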
|
|
|
## Monochrome |
|
|
|
These models are used for monochrome image classification, based on CNNs and Transformers, and trained on the dataset [deepghs/monochrome_danbooru(private)](https://huggingface.co/datasets/deepghs/monochrome_danbooru).
|
|
|
The following are the checkpoints that have been formally put into use, all based on the Caformer architecture: |
|
|
|
| Checkpoint | Algorithm | Safe Level | Accuracy | False Negative | False Positive | |
|
|:----------------------------:|:---------:|:----------:|:----------:|:--------------:|:--------------:| |
|
| monochrome-caformer-40 | caformer | 0 | 96.41% | 2.69% | 0.89% | |
|
| **monochrome-caformer-110** | caformer | 0 | **96.97%** | 1.57% | 1.46% | |
|
| monochrome-caformer_safe2-80 | caformer | 2 | 94.84% | **1.12%** | 4.03% | |
|
| monochrome-caformer_safe4-70 | caformer | 4 | 94.28% | **0.67%** | 5.04% | |
|
|
|
**`monochrome-caformer-110` has the best overall accuracy** among them. However, since these models are often used to screen out monochrome images,

and we want to catch as many as possible without omission, we also provide weighted models (`safe2` and `safe4`).

Although their overall accuracy is slightly lower, their false-negative rate (misidentifying a monochrome image as a colored one) is also lower,

making them more suitable for batch screening.
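The three rates in the table sum to roughly 100% per row, which suggests they are all measured against the full test set (this is an inference from the numbers, not a stated fact). Under that assumption, they relate to raw confusion counts as follows, taking "positive" to mean monochrome; the counts in the example are made up:

```python
# Illustrative only: how accuracy, false-negative and false-positive rates
# relate to confusion counts, assuming all rates are over the full test set
# and "positive" means monochrome.

def classification_rates(tp, fp, tn, fn):
    total = tp + fp + tn + fn
    return {
        'accuracy': (tp + tn) / total,
        'false_negative': fn / total,  # monochrome predicted as colored
        'false_positive': fp / total,  # colored predicted as monochrome
    }
```

A `safe`-weighted model effectively trades some `false_positive` (and overall accuracy) for a lower `false_negative`, which is the quantity that matters when screening.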
|
|
|
## Deepdanbooru |
|
|
|
`deepdanbooru` is a model used to tag anime images. Here, we provide a tag-classification table, `deepdanbooru_tags.csv`,

as well as an ONNX model (from `chinoll/deepdanbooru`).
|
|
|
It's worth noting that, due to the poor quality of the deepdanbooru model itself and its relatively old training data,

it is provided for testing purposes only and is not recommended as the main tagging model. We recommend the `wd14` model instead; see:
|
|
|
* https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags |
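A typical use of `deepdanbooru_tags.csv` is to map the model's per-tag score vector back to tag names. The sketch below is an assumption about the workflow, not the file's confirmed schema: the column names (`id`, `name`) and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical sketch: load tag names from a CSV and keep the tags whose
# score clears a threshold. Column names here are assumptions.

def load_tags(csv_text):
    return [row['name'] for row in csv.DictReader(io.StringIO(csv_text))]

def top_tags(scores, tags, threshold=0.5):
    # pair each tag with its score, filter by threshold, highest first
    pairs = [(tag, s) for tag, s in zip(tags, scores) if s >= threshold]
    return sorted(pairs, key=lambda p: p[1], reverse=True)
```

For instance, with tags `['1girl', 'solo', 'monochrome']` and scores `[0.9, 0.4, 0.7]`, a `0.5` threshold keeps `1girl` and `monochrome`, in score order.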
|
|
|
|