---
license: mit
datasets:
- deepghs/chafen_arknights
- deepghs/monochrome_danbooru
metrics:
- accuracy
---

# imgutils-models

This repository includes all the models in [deepghs/imgutils](https://github.com/deepghs/imgutils).

## LPIPS

This model is used for clustering anime images, i.e. sets of near-duplicate variants of one base illustration (called `差分` in Chinese). It is based on [richzhang/PerceptualSimilarity](https://github.com/richzhang/PerceptualSimilarity) and trained on the [deepghs/chafen_arknights](https://huggingface.co/datasets/deepghs/chafen_arknights) (private) dataset.

With the threshold set to `0.45`, the [adjusted rand score](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.adjusted_rand_score.html) reaches `0.995`.

File list:
* `lpips_diff.onnx`: computes the perceptual difference between two sets of extracted features.
* `lpips_feature.onnx`: extracts perceptual features from an image.
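
As a rough illustration, the snippet below clusters a set of images with these two models via `onnxruntime`. This is a minimal sketch, not the `imgutils` API: the exact input/output names and tensor layouts of the ONNX files, and the `(224, 224)` input size, are assumptions here. Pairwise distances are fed to DBSCAN so that images closer than the `0.45` threshold land in the same cluster.

```python
# Minimal sketch, not the imgutils API. Assumed (not taken from this card):
# the feature model accepts a (1, 3, H, W) float32 tensor in [0, 1], and the
# diff model's inputs line up, in order, with two sets of feature outputs.
import numpy as np
import onnxruntime
from PIL import Image
from sklearn.cluster import DBSCAN

feat_sess = onnxruntime.InferenceSession("lpips_feature.onnx", providers=["CPUExecutionProvider"])
diff_sess = onnxruntime.InferenceSession("lpips_diff.onnx", providers=["CPUExecutionProvider"])

def extract_features(path, size=(224, 224)):
    # Load, resize, and convert to a (1, 3, H, W) float32 tensor in [0, 1].
    img = Image.open(path).convert("RGB").resize(size)
    x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0
    return feat_sess.run(None, {feat_sess.get_inputs()[0].name: x})

def lpips_distance(f1, f2):
    # Feed both feature sets to the diff model, matching its inputs in order.
    names = [i.name for i in diff_sess.get_inputs()]
    feed = dict(zip(names, list(f1) + list(f2)))
    return float(diff_sess.run(None, feed)[0])

paths = ["cg_01.png", "cg_02.png", "cg_03.png"]  # hypothetical image files
feats = [extract_features(p) for p in paths]
dist = np.zeros((len(paths), len(paths)))
for i in range(len(paths)):
    for j in range(i + 1, len(paths)):
        dist[i, j] = dist[j, i] = lpips_distance(feats[i], feats[j])

# Images within the 0.45 threshold of each other end up in the same cluster.
labels = DBSCAN(eps=0.45, min_samples=1, metric="precomputed").fit_predict(dist)
print(dict(zip(paths, labels)))
```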

## Monochrome

These models are used for monochrome image classification. They are based on CNNs and Transformers and trained on the [deepghs/monochrome_danbooru](https://huggingface.co/datasets/deepghs/monochrome_danbooru) (private) dataset.

The following checkpoints have been formally put into use; all of them are based on the Caformer architecture:

|          Checkpoint          | Algorithm | Safe Level |  Accuracy  | False Negative | False Positive |
|:----------------------------:|:---------:|:----------:|:----------:|:--------------:|:--------------:|
|    monochrome-caformer-40    |  caformer |      0     |   96.41%   |      2.69%     |      0.89%     |
|  **monochrome-caformer-110** |  caformer |      0     | **96.97%** |      1.57%     |      1.46%     |
| monochrome-caformer_safe2-80 |  caformer |      2     |   94.84%   |    **1.12%**   |      4.03%     |
| monochrome-caformer_safe4-70 |  caformer |      4     |   94.28%   |    **0.67%**   |      5.04%     |

**`monochrome-caformer-110` has the best overall accuracy** among them. However, since this model is often used to screen out monochrome images, and we want to catch as many of them as possible without omission, we have also introduced weighted models (`safe2` and `safe4`). Although their overall accuracy is slightly lower, their False Negative rate (misidentifying a monochrome image as a colored one) is also lower, making them more suitable for batch screening. A usage sketch follows.
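
The snippet below scores a single image with one of these checkpoints via `onnxruntime`. It is a minimal sketch rather than the `imgutils` API: the ONNX file name, the `(384, 384)` input size, the `(1, 3, H, W)` layout, and the two-logit `(monochrome, colored)` output order are all assumptions.

```python
# Minimal sketch, not the imgutils API. Assumed (not taken from this card):
# the checkpoint is exported as "monochrome-caformer-110.onnx", takes a
# (1, 3, 384, 384) float32 tensor in [0, 1], and returns two logits in the
# order (monochrome, colored).
import numpy as np
import onnxruntime
from PIL import Image

sess = onnxruntime.InferenceSession("monochrome-caformer-110.onnx", providers=["CPUExecutionProvider"])

def monochrome_score(path, size=(384, 384)):
    img = Image.open(path).convert("RGB").resize(size)
    x = np.asarray(img, dtype=np.float32).transpose(2, 0, 1)[None] / 255.0
    logits = sess.run(None, {sess.get_inputs()[0].name: x})[0][0]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()  # softmax over the two classes
    return float(probs[0])  # probability that the image is monochrome

if monochrome_score("page.png") > 0.5:  # hypothetical input file
    print("likely monochrome")
```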

## Deepdanbooru

`deepdanbooru` is a model used to tag anime images. Here we provide a tag-classification table, `deepdanbooru_tags.csv`, as well as an ONNX model (from [chinoll/deepdanbooru](https://huggingface.co/chinoll/deepdanbooru)).

It's worth noting that, due to the poor quality of the deepdanbooru model itself and its relatively old training data, it is intended for testing purposes only and is not recommended as the main classification model. We recommend using the `wd14` model instead; see the link below (a usage sketch of the deepdanbooru model follows it):

* https://huggingface.co/spaces/SmilingWolf/wd-v1-4-tags
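
For completeness, here is a minimal sketch of running the deepdanbooru ONNX model together with `deepdanbooru_tags.csv`. The file name `deepdanbooru.onnx`, the `(1, 512, 512, 3)` NHWC input in `[0, 1]`, the CSV layout (tag name in the first column), and the alignment of output scores with CSV rows are all hypothetical here.

```python
# Minimal sketch, not the imgutils API. Assumed (not taken from this card):
# the model file name, the NHWC input layout, the [0, 1] scaling, and that
# the output scores are ordered like the rows of deepdanbooru_tags.csv.
import csv
import numpy as np
import onnxruntime
from PIL import Image

with open("deepdanbooru_tags.csv", newline="", encoding="utf-8") as f:
    tags = [row[0] for row in csv.reader(f)]  # assumes tag name in column 0

sess = onnxruntime.InferenceSession("deepdanbooru.onnx", providers=["CPUExecutionProvider"])  # hypothetical file name

def tag_image(path, threshold=0.5, size=(512, 512)):
    img = Image.open(path).convert("RGB").resize(size)
    x = np.asarray(img, dtype=np.float32)[None] / 255.0  # (1, H, W, 3), NHWC
    scores = sess.run(None, {sess.get_inputs()[0].name: x})[0][0]
    return [(tag, float(s)) for tag, s in zip(tags, scores) if s >= threshold]

print(tag_image("illust.png"))  # hypothetical input file
```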