update readme
README.md
CHANGED
@@ -14,9 +14,9 @@ A collection of specialized scripts for AI image processing, dataset preparation
 ---

-###

-An image tagging script using the WD V3 tagger models. Supports multiple model architectures (ViT, SwinV2, ConvNext) and can process both single images and directories recursively.

 #### Features

@@ -26,7 +26,7 @@ An image tagging script using the WD V3 tagger models. Supports multiple model a
 - CUDA acceleration with FP16 support
 - JXL image format support

-###

 A set of ZSH functions for managing AI model training workflows:

@@ -36,7 +36,7 @@ A set of ZSH functions for managing AI model training workflows:
 - Output directory management
 - Automatic cleanup of empty outputs

-###

 Enhanced Git functionality for dataset management:

@@ -44,7 +44,7 @@ Enhanced Git functionality for dataset management:
 - LFS integration for JXL files
 - Dataset-specific Git attributes management

-###

 Dataset caption file watermark detection utility:

@@ -53,7 +53,7 @@ Dataset caption file watermark detection utility:
 - Interactive editing with nvim
 - Recursive directory scanning

-###

 Directory-aware wrapper for gallery-dl:

@@ -61,16 +61,16 @@ Directory-aware wrapper for gallery-dl:
 - Maintains consistent download locations
 - Preserves original command functionality

-###

-Advanced image captioning system using CLIP and LLM

 - Multiple caption styles (descriptive, training prompts, art critic, etc.)
 - Custom image adapters
 - Tag-based caption generation
 - Batch processing support

-###

 Training progress visualization tool:

@@ -79,7 +79,7 @@ Training progress visualization tool:
 - Step counter overlay support
 - Multiple sample handling

-###

 Image comparison grid generator:

@@ -88,7 +88,7 @@ Image comparison grid generator:
 - Optional row/column labels
 - Automatic image padding and alignment

-###

 Utility for combining multiple caption files:

@@ -97,6 +97,100 @@ Utility for combining multiple caption files:
 - Batch processing support
 - Error handling for missing files

 <!-- ⚠️ TODO: add more scripts -->

 ## 🚀 Installation

@@ -126,7 +220,7 @@ nano ~/.zshrc
 ---

-- miniconda with the environment set up for training with sd-scripts, timm, etc
 - ZSH shell (optional)
 - CUDA-capable GPU (recommended)
 - Required Python packages:

@@ -142,10 +236,16 @@ nano ~/.zshrc
 ---

-Each script can be used independently or as part of a workflow. Here are some

 <!-- ⚠️ TODO: add more usage examples -->

 ### JoyCaption

```bash
@@ -14,9 +14,9 @@ A collection of specialized scripts for AI image processing, dataset preparation

 ---

+### `wdv3`

+An image tagging script using the WD V3 tagger models by [SmilingWolf](https://huggingface.co/SmilingWolf) based on [this repo](https://github.com/neggles/wdv3-timm). Supports multiple model architectures (ViT, SwinV2, ConvNext) and can process both single images and directories recursively.

 #### Features

@@ -26,7 +26,7 @@ An image tagging script using the WD V3 tagger models. Supports multiple model a
 - CUDA acceleration with FP16 support
 - JXL image format support

+### `train_functions`

 A set of ZSH functions for managing AI model training workflows:

@@ -36,7 +36,7 @@ A set of ZSH functions for managing AI model training workflows:
 - Output directory management
 - Automatic cleanup of empty outputs

+### `git-wrapper`

 Enhanced Git functionality for dataset management:

@@ -44,7 +44,7 @@ Enhanced Git functionality for dataset management:
 - LFS integration for JXL files
 - Dataset-specific Git attributes management

+### `check4sig`

 Dataset caption file watermark detection utility:

@@ -53,7 +53,7 @@ Dataset caption file watermark detection utility:
 - Interactive editing with nvim
 - Recursive directory scanning

+### `gallery-dl`

 Directory-aware wrapper for gallery-dl:

@@ -61,16 +61,16 @@ Directory-aware wrapper for gallery-dl:
 - Maintains consistent download locations
 - Preserves original command functionality

+### `joy`

+Advanced image captioning system by [fancyfeast](https://huggingface.co/fancyfeast) called [JoyCaption](https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two/tree/main) using CLIP and LLM

 - Multiple caption styles (descriptive, training prompts, art critic, etc.)
 - Custom image adapters
 - Tag-based caption generation
 - Batch processing support

+### `png2mp4`

 Training progress visualization tool:

@@ -79,7 +79,7 @@ Training progress visualization tool:
 - Step counter overlay support
 - Multiple sample handling

+### `xyplot`

 Image comparison grid generator:

@@ -88,7 +88,7 @@ Image comparison grid generator:
 - Optional row/column labels
 - Automatic image padding and alignment

+### `concat_captions`

 Utility for combining multiple caption files:

@@ -97,6 +97,100 @@ Utility for combining multiple caption files:
 - Batch processing support
 - Error handling for missing files

+### `stats`
+
+Directory analysis and statistics generation tool that provides detailed file counts and metrics:
+
+- Detailed file counting by extension with color-coded output for different file types (JXL, PNG, JPG, etc.)
+- Multiple sorting options (by name, count, or specific file types)
+- Recursive directory scanning with aggregated statistics
+- Color-coded thresholds for dataset size evaluation
+- Automatic categorization of files into image and text groups
+- Grand total calculations across all subdirectories
+
+### `shortcode`
+
+Hugo-compatible shortcode generator for image galleries with blurhash integration:
+
+- Generates Hugo-compatible shortcode blocks for each image
+- Integrates blurhash codes for progressive image loading
+- Automatically extracts and includes image dimensions
+- Preserves and integrates image captions from metadata
+- Supports grid layout configurations
+- Processes directories recursively while maintaining structure
+- Handles relative path resolution for static content
+
+### `yiffdata`
+
+Comprehensive image metadata extraction and JSON generation utility:
+
+- Extracts precise image dimensions using PIL
+- Combines existing blurhash codes from .bh files
+- Integrates caption data from .caption files
+- Generates consolidated JSON output with all metadata
+- Maintains original filename references
+- Supports batch processing of entire directories
+- Preserves file relationships and metadata hierarchy
+
+### `txt2tags`
+
+Batch file extension conversion utility for dataset management:
+
+- Converts .txt files to .tags format for ML training compatibility
+- Preserves original file content and structure
+- Supports recursive directory traversal
+- Interactive mode for selective conversion
+- Maintains original file timestamps and permissions
+- Simple command-line interface with directory input
+
+### `txt2emoji`
+
+Advanced text-to-emoji conversion system with context awareness:
+
+- Sophisticated word-to-emoji mapping with custom dictionaries
+- Context-aware emoji selection to avoid redundancy
+- Detailed conversion explanations with rationale
+- Batch processing with multiple output formats
+- Configurable threshold and filtering options
+- NLTK integration for improved text parsing
+- Extensive customization options for emoji mappings
+
+### `jtp2`
+
+State-of-the-art image classification system using [RedRocket](https://huggingface.co/RedRocket)'s [PILOT2](https://huggingface.co/RedRocket/JointTaggerProject/tree/main/JTP_PILOT2) model:
+
+- Implements Vision Transformer architecture with custom modifications
+- Features GatedHead classifier for improved accuracy
+- CUDA-accelerated inference with FP16 support
+- Configurable confidence thresholds for tag generation
+- Comprehensive batch processing capabilities
+- Automatic tag file generation alongside images
+- Supports multiple image formats including JXL
+
+### `keyframe`
+
+Efficient video keyframe extraction tool using FFmpeg:
+
+- Extracts high-quality keyframes from video files
+- Creates organized output directories automatically
+- Maintains original frame quality and metadata
+- Intelligent I-frame detection and extraction
+- Sequential frame naming with padding
+- Minimal quality loss during extraction
+- Simple command-line interface
+
+### `chop_blocks`
+
+Advanced LoRA model manipulation tool for fine-grained control using code from [resize-lora](https://github.com/elias-gaeros/resize_lora) by [Gaeros](https://github.com/elias-gaeros):
+
+- Precise block-level filtering of LoRA models
+- Sophisticated weight adjustment capabilities
+- Full SafeTensors format support
+- Detailed analysis and reporting of model structure
+- Preserves model metadata during modifications
+- Vector string format for block manipulation
+- Supports both SDXL and SD1 naming conventions
+
 <!-- ⚠️ TODO: add more scripts -->

 ## 🚀 Installation

@@ -126,7 +220,7 @@ nano ~/.zshrc

 ---

+- miniconda with the environment set up for training with sd-scripts, inferring with timm, llama, etc
 - ZSH shell (optional)
 - CUDA-capable GPU (recommended)
 - Required Python packages:

@@ -142,10 +236,16 @@ nano ~/.zshrc

 ---

+Each script can be used independently or as part of a workflow. Here are some usage examples:

 <!-- ⚠️ TODO: add more usage examples -->

+### XY Plot
+
+```bash
+xyplot ./ComfyUI_00341_.png ./ComfyUI_00342_.png ./ComfyUI_00346_.png --column-labels "No LoRA" "minit-v1s6000.safetensors M:1.0 TE:1.0" "minit-v1s6000.safetensors M:1.40 TE:1.0" --rows 1 --output plot1.png
+```
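
Editor's sketch (not part of the commit): the `--rows` option and the "automatic image padding and alignment" feature listed for `xyplot` can be pictured with a small layout calculation. This is not the `xyplot` source. The function name `grid_layout`, its parameters, and the choice to center each image in a uniform cell are all assumptions made for illustration, and the image sizes in the example are made up.

```python
from math import ceil


def grid_layout(sizes, rows):
    """Return (canvas_size, offsets) for images placed row-major in a grid.

    sizes: list of (width, height) tuples, one per image (hypothetical input).
    rows:  number of grid rows; the column count is derived from the image count.
    """
    if rows < 1 or not sizes:
        raise ValueError("need at least one row and one image")
    cols = ceil(len(sizes) / rows)
    # Assumption: every cell is as large as the largest image, so nothing is cropped.
    cell_w = max(w for w, _ in sizes)
    cell_h = max(h for _, h in sizes)
    offsets = []
    for i, (w, h) in enumerate(sizes):
        row, col = divmod(i, cols)
        # Center smaller images inside their cell; the leftover space is the padding.
        x = col * cell_w + (cell_w - w) // 2
        y = row * cell_h + (cell_h - h) // 2
        offsets.append((x, y))
    return (cols * cell_w, rows * cell_h), offsets


# Three images in a single row, the last one narrower than the others.
canvas, offsets = grid_layout([(1024, 1024), (1024, 1024), (768, 1024)], rows=1)
print(canvas, offsets)
```

With an imaging library, each offset would be the top-left corner at which the corresponding image is pasted onto a canvas of the computed size; label rendering is left out of this sketch.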
+
 ### JoyCaption

```bash