Ketengan-Diffusion
committed on
Added Chart
README.md
CHANGED
@@ -60,6 +60,10 @@ Image source: [Source1](https://danbooru.donmai.us/posts/3143351) [Source2](http
 
 Our dataset is scored using the pretrained CLIP+MLP aesthetic scoring model from https://github.com/christophschuhmann/improved-aesthetic-predictor, and we adjusted our script to detect any text or watermark using OCR via pytesseract.
 
+<p align="center">
+<img src="Chart.png" width=70% height=70%>
+</p>
+
 This scoring method has a scale from -1 to 100. We take a score threshold of around 17 or 20 as the minimum and 50-75 as the maximum to retain the 2D style of the dataset. Any image containing text returns a score of -1. So any image with a score below 17 or above 65 is deleted.
 
 The dataset curation process runs on an Nvidia T4 16GB machine and takes about 7 days to curate 1,000,000 images.
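The filtering rule described in the README text can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the function names `final_score` and `keep_image` are hypothetical, the aesthetic score and OCR output are passed in as plain values rather than computed, and the 17 and 65 cut-offs are taken from the last sentence of the scoring paragraph (the README also mentions "17 or 20" and "50-75", so treat them as adjustable).

```python
# Assumed thresholds, taken from "below 17 or above 65 is deleted".
MIN_SCORE = 17.0
MAX_SCORE = 65.0
# Score assigned when OCR finds text or a watermark, per the README.
TEXT_SCORE = -1.0


def final_score(aesthetic_score: float, ocr_text: str) -> float:
    """Return -1 if OCR produced any non-whitespace text, else the aesthetic score."""
    if ocr_text.strip():
        return TEXT_SCORE
    return aesthetic_score


def keep_image(score: float,
               min_score: float = MIN_SCORE,
               max_score: float = MAX_SCORE) -> bool:
    """Keep an image only if its score lies within [min_score, max_score]."""
    return min_score <= score <= max_score


if __name__ == "__main__":
    # Hypothetical (aesthetic score, OCR text) pairs for illustration only.
    samples = [(42.0, ""), (12.5, ""), (80.0, ""), (55.0, "sample watermark")]
    for aesthetic, text in samples:
        score = final_score(aesthetic, text)
        print(aesthetic, repr(text), score, keep_image(score))
```

In a real pipeline, `ocr_text` would presumably come from something like `pytesseract.image_to_string(image)` and `aesthetic_score` from the CLIP+MLP predictor. Because a text hit is mapped to -1, it always falls below the minimum and is dropped by the same range check.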