---
license: mit
language:
- en
base_model:
- fancyfeast/llama-joycaption-alpha-two-hf-llava
---

# LLM Caption

This Python CLI script generates a caption file for every image in a specified folder. Each caption is saved under the same filename as its image, with a `.txt` extension, either in the image folder or in the directory given by the `--output_dir` argument. Images that already have a corresponding caption file in the output directory are skipped.

This project is not original work; it is an adaptation of several projects by <https://huggingface.co/fancyfeast>, <https://huggingface.co/John6666>, and <https://huggingface.co/Wi-zz>.

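The naming and skip rules from the introduction can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in `caption.py`: the function name `pending_captions` and the `IMAGE_EXTENSIONS` set are hypothetical, and the real script may recognise a different set of image types.

```python
from pathlib import Path
from typing import Iterator, Optional, Tuple

# Assumed set of image extensions; the real script may accept a different list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".bmp"}


def pending_captions(image_dir: str, output_dir: Optional[str] = None) -> Iterator[Tuple[Path, Path]]:
    """Yield (image, caption_file) pairs for images that still need a caption."""
    source = Path(image_dir)
    # Captions go next to the images unless an output directory is given.
    target = Path(output_dir) if output_dir else source
    for image in sorted(source.iterdir()):
        if not image.is_file() or image.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        # Same filename as the image, with a .txt extension.
        caption_file = target / (image.stem + ".txt")
        if caption_file.exists():
            continue  # a caption already exists, so this image is skipped
        yield image, caption_file
```

Because the check is made against the output directory, rerunning the script on the same folder would only process images added since the last run.
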
## Installation

```bash
# Create and activate a virtual environment, then install the dependencies
python3 -m venv ./venv
source venv/bin/activate
pip install -r requirements.txt
```

## Dependencies

* Google SigLIP (3.5GB) is downloaded automatically from <https://huggingface.co/google/siglip-so400m-patch14-384>
* Uncensored Lexi Llama-3.1-8B-Instruct (5.5GB) is downloaded automatically from <https://huggingface.co/John6666/Llama-3.1-8B-Lexi-Uncensored-V2-nf4>
* The Joy Caption model is in the `checkpoint` folder

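If you would rather fetch the two downloadable models before the first run, something like the following may work. It is a sketch that assumes the script loads them through the standard Hugging Face cache, which this README does not confirm. `prefetch` and `DEPENDENCY_REPOS` are hypothetical names; `snapshot_download` is the `huggingface_hub` function that downloads a whole repository and returns its local path.

```python
from typing import Callable, Dict, Iterable, Optional

# Repository ids taken from the URLs listed above.
DEPENDENCY_REPOS = (
    "google/siglip-so400m-patch14-384",
    "John6666/Llama-3.1-8B-Lexi-Uncensored-V2-nf4",
)


def prefetch(repos: Iterable[str] = DEPENDENCY_REPOS,
             download: Optional[Callable[..., str]] = None) -> Dict[str, str]:
    """Download each repository and return a mapping of repo id to local path."""
    if download is None:
        # Imported lazily so this module loads even without huggingface_hub installed.
        from huggingface_hub import snapshot_download
        download = snapshot_download
    return {repo: download(repo_id=repo) for repo in repos}
```

Calling `prefetch()` with no arguments downloads roughly 9GB in total, going by the sizes listed above. The `download` parameter exists only so the function can be exercised without network access.
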
## Usage

```bash
# Example 1: caption every image in ./test and write the .txt files next to the images
python3 ./caption.py ./test

# Example 2: use a custom prompt and write the captions to a separate directory
python3 ./caption.py ./test \
  --prompt "Describe this image in detail within 50 words." \
  --output_dir /tmp/caption
```

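The command-line interface used in the examples above (one positional folder plus the optional `--prompt` and `--output_dir` flags) could be declared with `argparse` roughly as follows. This is a sketch and not the parser in `caption.py`: `image_dir`, `build_parser`, and `parse_args` are hypothetical names, and `DEFAULT_PROMPT` is a placeholder for the text in the "Default prompt" section.

```python
import argparse
from typing import List, Optional

# Placeholder only; the actual default prompt is given in the "Default prompt" section.
DEFAULT_PROMPT = "<default prompt>"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Generate caption files for a folder of images.")
    parser.add_argument("image_dir", help="folder containing the images to caption")
    parser.add_argument("--prompt", default=DEFAULT_PROMPT, help="instruction sent to the caption model")
    parser.add_argument("--output_dir", default=None,
                        help="where to write the .txt files (defaults to image_dir)")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    args = build_parser().parse_args(argv)
    # Without --output_dir, captions are written next to the images.
    if args.output_dir is None:
        args.output_dir = args.image_dir
    return args
```

With this sketch, `parse_args(["./test"])` corresponds to the first example and yields an `output_dir` equal to `./test`.
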
## Default prompt

In one paragraph, write a very descriptive caption for this image, describe all objects, characters and their actions, describe in detail what is happening and their emotions. Include information about lighting, the style of this image and information about camera angle within 200 words. Don't create any title for the image.