k4d3 committed on
Commit
815dbdb
·
1 Parent(s): e89ba19

update readme

  1. README.md +113 -13
README.md CHANGED
@@ -14,9 +14,9 @@ A collection of specialized scripts for AI image processing, dataset preparation
14
 
15
  ---
16
 
17
- ### WDV3 (Waifu Diffusion V3 Tagger)
18
 
19
- An image tagging script using the WD V3 tagger models. Supports multiple model architectures (ViT, SwinV2, ConvNext) and can process both single images and directories recursively.
20
 
21
  #### Features
22
 
@@ -26,7 +26,7 @@ An image tagging script using the WD V3 tagger models. Supports multiple model a
26
  - CUDA acceleration with FP16 support
27
  - JXL image format support
28
 
29
- ### Training Functions (train_functions.zsh)
30
 
31
  A set of ZSH functions for managing AI model training workflows:
32
 
@@ -36,7 +36,7 @@ A set of ZSH functions for managing AI model training workflows:
36
  - Output directory management
37
  - Automatic cleanup of empty outputs
38
 
39
- ### Git Wrapper (git-wrapper.zsh)
40
 
41
  Enhanced Git functionality for dataset management:
42
 
@@ -44,7 +44,7 @@ Enhanced Git functionality for dataset management:
44
  - LFS integration for JXL files
45
  - Dataset-specific Git attributes management
46
 
47
- ### Check4sig (check4sig.zsh)
48
 
49
  Dataset caption file watermark detection utility:
50
 
@@ -53,7 +53,7 @@ Dataset caption file watermark detection utility:
53
  - Interactive editing with nvim
54
  - Recursive directory scanning
55
 
56
- ### Gallery-dl Wrapper (gallery-dl.zsh)
57
 
58
  Directory-aware wrapper for gallery-dl:
59
 
@@ -61,16 +61,16 @@ Directory-aware wrapper for gallery-dl:
61
  - Maintains consistent download locations
62
  - Preserves original command functionality
63
 
64
- ### JoyCaption (joy)
65
 
66
- Advanced image captioning system using CLIP and LLM:
67
 
68
  - Multiple caption styles (descriptive, training prompts, art critic, etc.)
69
  - Custom image adapters
70
  - Tag-based caption generation
71
  - Batch processing support
72
 
73
- ### PNG to MP4 Converter (png2mp4)
74
 
75
  Training progress visualization tool:
76
 
@@ -79,7 +79,7 @@ Training progress visualization tool:
79
  - Step counter overlay support
80
  - Multiple sample handling
81
 
82
- ### XY Plot Generator (xyplot)
83
 
84
  Image comparison grid generator:
85
 
@@ -88,7 +88,7 @@ Image comparison grid generator:
88
  - Optional row/column labels
89
  - Automatic image padding and alignment
90
 
91
- ### Caption Concatenator (concat_captions)
92
 
93
  Utility for combining multiple caption files:
94
 
@@ -97,6 +97,100 @@ Utility for combining multiple caption files:
97
  - Batch processing support
98
  - Error handling for missing files
99
 
100
  <!-- ⚠️ TODO: add more scripts -->
101
 
102
  ## 🚀 Installation
@@ -126,7 +220,7 @@ nano ~/.zshrc
126
 
127
  ---
128
 
129
- - miniconda with the environment set up for training with sd-scripts, timm, etc
130
  - ZSH shell (optional)
131
  - CUDA-capable GPU (recommended)
132
  - Required Python packages:
@@ -142,10 +236,16 @@ nano ~/.zshrc
142
 
143
  ---
144
 
145
- Each script can be used independently or as part of a workflow. Here are some common usage examples:
146
 
147
  <!-- ⚠️ TODO: add more usage examples -->
148
 
149
  ### JoyCaption
150
 
151
  ```bash
 
14
 
15
  ---
16
 
17
+ ### `wdv3`
18
 
19
+ An image tagging script that uses the WD V3 tagger models by [SmilingWolf](https://huggingface.co/SmilingWolf), based on [wdv3-timm](https://github.com/neggles/wdv3-timm). Supports multiple model architectures (ViT, SwinV2, ConvNext) and can process both single images and directories recursively.
20
 
21
  #### Features
22
 
 
26
  - CUDA acceleration with FP16 support
27
  - JXL image format support
28
 
29
+ ### `train_functions`
30
 
31
  A set of ZSH functions for managing AI model training workflows:
32
 
 
36
  - Output directory management
37
  - Automatic cleanup of empty outputs
38
 
39
+ ### `git-wrapper`
40
 
41
  Enhanced Git functionality for dataset management:
42
 
 
44
  - LFS integration for JXL files
45
  - Dataset-specific Git attributes management
46
 
47
+ ### `check4sig`
48
 
49
  Dataset caption file watermark detection utility:
50
 
 
53
  - Interactive editing with nvim
54
  - Recursive directory scanning
55
 
56
+ ### `gallery-dl`
57
 
58
  Directory-aware wrapper for gallery-dl:
59
 
 
61
  - Maintains consistent download locations
62
  - Preserves original command functionality
63
 
64
+ ### `joy`
65
 
66
+ Advanced image captioning system based on [JoyCaption](https://huggingface.co/spaces/fancyfeast/joy-caption-alpha-two/tree/main) by [fancyfeast](https://huggingface.co/fancyfeast), using CLIP and an LLM:
67
 
68
  - Multiple caption styles (descriptive, training prompts, art critic, etc.)
69
  - Custom image adapters
70
  - Tag-based caption generation
71
  - Batch processing support
72
 
73
+ ### `png2mp4`
74
 
75
  Training progress visualization tool:
76
 
 
79
  - Step counter overlay support
80
  - Multiple sample handling
81
 
82
+ ### `xyplot`
83
 
84
  Image comparison grid generator:
85
 
 
88
  - Optional row/column labels
89
  - Automatic image padding and alignment
90
 
91
+ ### `concat_captions`
92
 
93
  Utility for combining multiple caption files:
94
 
 
97
  - Batch processing support
98
  - Error handling for missing files
99
 
100
+ ### `stats`
101
+
102
+ Directory analysis and statistics generation tool that provides detailed file counts and metrics:
103
+
104
+ - Detailed file counting by extension with color-coded output for different file types (JXL, PNG, JPG, etc.)
105
+ - Multiple sorting options (by name, count, or specific file types)
106
+ - Recursive directory scanning with aggregated statistics
107
+ - Color-coded thresholds for dataset size evaluation
108
+ - Automatic categorization of files into image and text groups
109
+ - Grand total calculations across all subdirectories
110
+
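The counting and grouping that `stats` describes can be sketched in a few lines of Python. This is only an illustration under assumptions, not the script's code: the function names and the two extension sets below are hypothetical, and color output and sorting options are left out.

```python
from collections import Counter
from pathlib import Path

# Assumed groupings; the README does not list the script's real extension sets.
IMAGE_EXTS = {".jxl", ".png", ".jpg", ".jpeg", ".webp"}
TEXT_EXTS = {".txt", ".caption", ".tags"}


def count_by_extension(root):
    """Recursively count files under root, keyed by lowercase extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(no extension)"] += 1
    return counts


def summarize(counts):
    """Aggregate per-extension counts into image, text, and grand totals."""
    return {
        "images": sum(n for ext, n in counts.items() if ext in IMAGE_EXTS),
        "text": sum(n for ext, n in counts.items() if ext in TEXT_EXTS),
        "total": sum(counts.values()),
    }
```

Calling `count_by_extension(".").most_common()` gives the per-extension counts sorted by size, which roughly corresponds to the "sort by count" option mentioned above.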
111
+ ### `shortcode`
112
+
113
+ Hugo-compatible shortcode generator for image galleries with blurhash integration:
114
+
115
+ - Generates Hugo-compatible shortcode blocks for each image
116
+ - Integrates blurhash codes for progressive image loading
117
+ - Automatically extracts and includes image dimensions
118
+ - Preserves and integrates image captions from metadata
119
+ - Supports grid layout configurations
120
+ - Processes directories recursively while maintaining structure
121
+ - Handles relative path resolution for static content
122
+
123
+ ### `yiffdata`
124
+
125
+ Comprehensive image metadata extraction and JSON generation utility:
126
+
127
+ - Extracts precise image dimensions using PIL
128
+ - Combines existing blurhash codes from .bh files
129
+ - Integrates caption data from .caption files
130
+ - Generates consolidated JSON output with all metadata
131
+ - Maintains original filename references
132
+ - Supports batch processing of entire directories
133
+ - Preserves file relationships and metadata hierarchy
134
+
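As a rough sketch of the metadata merge that `yiffdata` describes, the Python below pairs each image with its `.bh` and `.caption` sidecars and builds JSON-ready records. The sidecar naming (`image.png` next to `image.bh`), the record keys, and every function name here are assumptions for illustration; the actual script may differ.

```python
import json
from pathlib import Path


def pil_size(path):
    """Return (width, height) using Pillow, imported lazily."""
    from PIL import Image
    with Image.open(path) as img:
        return img.size


def read_sidecar(image_path, suffix):
    """Return the stripped text of a sidecar such as image.bh, or None if absent."""
    sidecar = image_path.with_suffix(suffix)
    if sidecar.is_file():
        return sidecar.read_text(encoding="utf-8").strip()
    return None


def collect_metadata(directory, exts=(".png", ".jpg", ".jpeg"), read_size=pil_size):
    """Build one record per image with dimensions, blurhash, and caption."""
    records = []
    for image in sorted(Path(directory).iterdir()):
        if image.suffix.lower() not in exts:
            continue
        width, height = read_size(image)
        records.append({
            "filename": image.name,  # keep the original filename reference
            "width": width,
            "height": height,
            "blurhash": read_sidecar(image, ".bh"),
            "caption": read_sidecar(image, ".caption"),
        })
    return records


def to_json(records):
    """Serialize the consolidated records as indented JSON."""
    return json.dumps(records, indent=2, ensure_ascii=False)
```

The `read_size` parameter exists only so the size lookup can be swapped out, for example in tests that run without Pillow installed.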
135
+ ### `txt2tags`
136
+
137
+ Batch file extension conversion utility for dataset management:
138
+
139
+ - Converts .txt files to .tags format for ML training compatibility
140
+ - Preserves original file content and structure
141
+ - Supports recursive directory traversal
142
+ - Interactive mode for selective conversion
143
+ - Maintains original file timestamps and permissions
144
+ - Simple command-line interface with directory input
145
+
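The conversion `txt2tags` describes amounts to a recursive rename, which leaves file content, timestamps, and permissions untouched. The following is a minimal Python sketch under that assumption; `txt_to_tags` and its `dry_run` flag are hypothetical names, and the interactive mode is not modeled.

```python
from pathlib import Path


def txt_to_tags(root, dry_run=False):
    """Rename every *.txt under root to *.tags and return the new paths.

    A rename changes only the name, so content, mtime, and permissions
    stay as they were. Existing .tags files are never overwritten.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for src in sorted(Path(root).rglob("*.txt")):
        if not src.is_file():
            continue
        dst = src.with_suffix(".tags")
        if dst.exists():
            continue  # skip rather than clobber an existing tag file
        if not dry_run:
            src.rename(dst)
        renamed.append(dst)
    return renamed
```

Running it once with `dry_run=True` lists what would change before anything is touched.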
146
+ ### `txt2emoji`
147
+
148
+ Advanced text-to-emoji conversion system with context awareness:
149
+
150
+ - Sophisticated word-to-emoji mapping with custom dictionaries
151
+ - Context-aware emoji selection to avoid redundancy
152
+ - Detailed conversion explanations with rationale
153
+ - Batch processing with multiple output formats
154
+ - Configurable threshold and filtering options
155
+ - NLTK integration for improved text parsing
156
+ - Extensive customization options for emoji mappings
157
+
158
+ ### `jtp2`
159
+
160
+ State-of-the-art image classification system using [RedRocket](https://huggingface.co/RedRocket)'s [PILOT2](https://huggingface.co/RedRocket/JointTaggerProject/tree/main/JTP_PILOT2) model:
161
+
162
+ - Implements Vision Transformer architecture with custom modifications
163
+ - Features GatedHead classifier for improved accuracy
164
+ - CUDA-accelerated inference with FP16 support
165
+ - Configurable confidence thresholds for tag generation
166
+ - Comprehensive batch processing capabilities
167
+ - Automatic tag file generation alongside images
168
+ - Supports multiple image formats including JXL
169
+
170
+ ### `keyframe`
171
+
172
+ Efficient video keyframe extraction tool using FFmpeg:
173
+
174
+ - Extracts high-quality keyframes from video files
175
+ - Creates organized output directories automatically
176
+ - Maintains original frame quality and metadata
177
+ - Intelligent I-frame detection and extraction
178
+ - Sequential frame naming with padding
179
+ - Minimal quality loss during extraction
180
+ - Simple command-line interface
181
+
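One common way to extract I-frames with FFmpeg is the `select` filter. The Python sketch below builds such a command; it may not match what `keyframe` actually runs. The function names, the PNG output format, and the six-digit naming pattern are assumptions. Recent FFmpeg releases prefer `-fps_mode vfr` over `-vsync vfr`.

```python
import subprocess
from pathlib import Path


def keyframe_command(video, out_dir=None):
    """Build an ffmpeg argv list that writes only I-frames as numbered PNGs."""
    video = Path(video)
    # Assumed default: a directory named after the video, next to it.
    out_dir = Path(out_dir) if out_dir else video.with_suffix("")
    return [
        "ffmpeg",
        "-i", str(video),
        # Keep only frames whose picture type is I; the comma is escaped
        # for ffmpeg's filtergraph parser, not for a shell.
        "-vf", r"select=eq(pict_type\,I)",
        # Drop the timestamps of discarded frames instead of duplicating frames.
        "-vsync", "vfr",
        str(out_dir / "%06d.png"),  # zero-padded sequential names
    ]


def extract_keyframes(video, out_dir=None):
    """Create the output directory and run ffmpeg (must be on PATH)."""
    cmd = keyframe_command(video, out_dir)
    target = Path(cmd[-1]).parent
    target.mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)
    return target
```

Building the argument list separately from running it makes the command easy to inspect or log before FFmpeg is invoked.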
182
+ ### `chop_blocks`
183
+
184
+ Advanced LoRA model manipulation tool for fine-grained control using code from [resize-lora](https://github.com/elias-gaeros/resize_lora) by [Gaeros](https://github.com/elias-gaeros):
185
+
186
+ - Precise block-level filtering of LoRA models
187
+ - Sophisticated weight adjustment capabilities
188
+ - Full SafeTensors format support
189
+ - Detailed analysis and reporting of model structure
190
+ - Preserves model metadata during modifications
191
+ - Vector string format for block manipulation
192
+ - Supports both SDXL and SD1 naming conventions
193
+
194
  <!-- ⚠️ TODO: add more scripts -->
195
 
196
  ## 🚀 Installation
 
220
 
221
  ---
222
 
223
+ - Miniconda with an environment set up for training with sd-scripts and for inference with timm, llama, etc.
224
  - ZSH shell (optional)
225
  - CUDA-capable GPU (recommended)
226
  - Required Python packages:
 
236
 
237
  ---
238
 
239
+ Each script can be used independently or as part of a workflow. Here are some usage examples:
240
 
241
  <!-- ⚠️ TODO: add more usage examples -->
242
 
243
+ ### XY Plot
244
+
245
+ ```bash
246
+ xyplot ./ComfyUI_00341_.png ./ComfyUI_00342_.png ./ComfyUI_00346_.png --column-labels "No LoRA" "minit-v1s6000.safetensors M:1.0 TE:1.0" "minit-v1s6000.safetensors M:1.40 TE:1.0" --rows 1 --output plot1.png
247
+ ```
248
+
249
  ### JoyCaption
250
 
251
  ```bash