---
license: bsd-3-clause
base_model: pszemraj/long-t5-tglobal-base-16384-book-summary
tags:
- generated_from_trainer
model-index:
- name: output
results: []
---
# Model description
This model is a fine-tuned version of [pszemraj/long-t5-tglobal-base-16384-book-summary](https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary) on a small custom dataset.
The dataset was built by feeding [kmfoda/booksum](https://huggingface.co/datasets/kmfoda/booksum) through GPT-3.5-turbo with a carefully tuned prompt that turns each passage into a high-quality Stable Diffusion prompt.
The resulting dataset (generated for less than $10 of OpenAI credits) contains roughly 15k entries and was intended as a proof of concept.
The goal was to build a text-summarization model that produces Stable Diffusion prompts comparable to those written by a human or a high-end LLM such as GPT-4.
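A minimal sketch of how such a dataset can be produced, assuming the current `openai` Python client and the public booksum split; the system prompt shown is only a placeholder, since the actual hand-tuned prompt used for this model is not published:
```python
from datasets import load_dataset
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# kmfoda/booksum provides long chapter texts to condense into prompts
books = load_dataset("kmfoda/booksum", split="train")

def to_sd_prompt(chapter_text: str) -> str:
    """Ask GPT-3.5-turbo to turn a book passage into a Stable Diffusion prompt."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            # Placeholder instruction; the real prompt was tuned by hand and is not included here.
            {"role": "system", "content": "Rewrite the passage as a comma-separated Stable Diffusion prompt."},
            {"role": "user", "content": chapter_text[:8000]},  # truncate to stay within the context window
        ],
    )
    return response.choices[0].message.content
```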
Example generations from an excerpt of Hemingway:
```
this model: village in late summer, river and plain, mountains, pebbled boulders, blue water, troops marching, dusty trees, soldiers marching along road, crops rich with fruit trees, battle in the mountains, artillery flashes, cool nights, highly detailed, dramatic lighting
gpt-4: desert landscape with camel caravan at sunset, nomad tents, sand dunes, oasis, traditional clothing, dramatic lighting, 8k UHD, highly detailed, masterpiece, digital painting, global illumination
```
This is a VERY rough proof-of-concept model that could be greatly improved with a higher-quality dataset and possibly different hyperparameters.
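A minimal inference sketch, assuming the checkpoint is used through the standard `transformers` summarization pipeline (replace the model path with this repository's Hub id or a local directory):
```python
from transformers import pipeline

# Replace with this repository's Hub id or a local path to the fine-tuned checkpoint
summarizer = pipeline("summarization", model="path/to/this-checkpoint")

excerpt = (
    "In the late summer of that year we lived in a house in a village that looked "
    "across the river and the plain to the mountains."
)
result = summarizer(excerpt, max_length=96, min_length=16, no_repeat_ngram_size=3)
print(result[0]["summary_text"])  # e.g. "village in late summer, river and plain, mountains, ..."
```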
## Training procedure
Training ran for 7 epochs using a modified version of the Hugging Face `run_summarization.py` training script.
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 6
- total_train_batch_size: 48
- total_eval_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7.0
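For illustration, the hyperparameters above map onto `Seq2SeqTrainingArguments` roughly as follows (a sketch of equivalent settings, not the exact invocation that was used):
```python
from transformers import Seq2SeqTrainingArguments

# Equivalent settings to the list above; with 2 GPUs, 4 * 2 * 6 = 48 effective train batch size
training_args = Seq2SeqTrainingArguments(
    output_dir="output",
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=6,
    num_train_epochs=7.0,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```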
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.453 | 0.28 | 30 | 2.0444 |
| 2.2692 | 0.56 | 60 | 1.8970 |
| 2.1485 | 0.84 | 90 | 1.8373 |
| 2.0469 | 1.12 | 120 | 1.8033 |
| 1.9954 | 1.4 | 150 | 1.7762 |
| 1.9778 | 1.68 | 180 | 1.7593 |
| 1.9536 | 1.96 | 210 | 1.7472 |
| 1.8524 | 2.24 | 240 | 1.7306 |
| 1.8438 | 2.52 | 270 | 1.7255 |
| 1.8436 | 2.8 | 300 | 1.7140 |
| 1.7765 | 3.08 | 330 | 1.7049 |
| 1.7537 | 3.36 | 360 | 1.7057 |
| 1.7328 | 3.64 | 390 | 1.6977 |
| 1.723 | 3.92 | 420 | 1.6973 |
| 1.6592 | 4.2 | 450 | 1.7058 |
| 1.6563 | 4.48 | 480 | 1.7034 |
| 1.6443 | 4.76 | 510 | 1.6969 |
| 1.5782 | 5.04 | 540 | 1.6953 |
| 1.509 | 5.32 | 570 | 1.7136 |
| 1.5516 | 5.6 | 600 | 1.7064 |
| 1.558 | 5.88 | 630 | 1.7045 |
| 1.5016 | 6.16 | 660 | 1.7182 |
| 1.5288 | 6.44 | 690 | 1.7111 |
| 1.4665 | 6.72 | 720 | 1.7030 |
### Framework versions
- Transformers 4.36.0.dev0
- Pytorch 2.1.1+cu118
- Datasets 2.15.0
- Tokenizers 0.15.0