Model description

This model is a fine-tuned version of pszemraj/long-t5-tglobal-base-16384-book-summary, trained on a small custom dataset. The dataset was built by feeding kmfoda/booksum into GPT-3.5-turbo with a carefully tuned prompt so that it output high-quality Stable Diffusion prompts. As a proof of concept, the dataset was kept small: roughly 15k entries, generated for less than $10 of OpenAI credits.

The goal was a text summarization model that produces decent Stable Diffusion prompts, comparable to those written by a human or a high-end LLM such as GPT-4.

Example generations from an excerpt of Hemingway:

this model: village in late summer, river and plain, mountains, pebbled boulders, blue water, troops marching, dusty trees, soldiers marching along road, crops rich with fruit trees, battle in the mountains, artillery flashes, cool nights, highly detailed, dramatic lighting

gpt-4: desert landscape with camel caravan at sunset, nomad tents, sand dunes, oasis, traditional clothing, dramatic lighting, 8k UHD, highly detailed, masterpiece, digital painting, global illumination

This is a VERY rough proof-of-concept model that could be greatly improved with a higher-quality dataset and possibly different hyperparameters.
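As a rough illustration of how the model might be used, the sketch below wraps the Transformers summarization pipeline. The repository name is the one shown on the model page; the generation settings (`max_new_tokens`, `num_beams`, `no_repeat_ngram_size`) and the helper names `generate_sd_prompt` and `split_tags` are assumptions made for this example, not settings documented by the author.

```python
MODEL_ID = "vahn9995/longt5-stable-diffusion-prompt"


def generate_sd_prompt(text, model_id=MODEL_ID, max_new_tokens=64):
    """Summarize a passage into a comma-separated Stable Diffusion prompt."""
    # Imported lazily so split_tags() below works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    # Generation settings are illustrative guesses, not the author's values.
    result = summarizer(
        text,
        max_new_tokens=max_new_tokens,
        num_beams=4,
        no_repeat_ngram_size=3,
        truncation=True,
    )
    return result[0]["summary_text"]


def split_tags(prompt):
    """Split a comma-separated prompt into a list of stripped, non-empty tags."""
    return [tag.strip() for tag in prompt.split(",") if tag.strip()]
```

Calling `generate_sd_prompt(excerpt)` downloads the checkpoint on first use and should return a single comma-separated string like the example generation above; `split_tags` turns that string into a list if individual tags need to be filtered or reordered before being passed to an image model.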

Training procedure

Training ran for 7 epochs using a modified version of the Hugging Face run_summarization.py training script.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 6
  • total_train_batch_size: 48
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 7.0
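The total batch sizes in the list follow from the per-device sizes, the device count, and gradient accumulation. The sketch below restates the listed values using the field names of `Seq2SeqTrainingArguments` and checks that arithmetic. Mapping `train_batch_size` to `per_device_train_batch_size` is an assumption, though it is consistent with the reported totals.

```python
# Values copied from the hyperparameter list above. Key names follow the
# corresponding Seq2SeqTrainingArguments fields (mapping is an assumption).
hparams = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 6,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7.0,
}
NUM_DEVICES = 2  # set by the multi-GPU launcher, not by a training argument

# Each optimizer step sees: per-device batch x devices x accumulation steps.
total_train_batch_size = (
    hparams["per_device_train_batch_size"]
    * NUM_DEVICES
    * hparams["gradient_accumulation_steps"]
)
# Evaluation does not accumulate gradients, so only the device count applies.
total_eval_batch_size = hparams["per_device_eval_batch_size"] * NUM_DEVICES

print(total_train_batch_size, total_eval_batch_size)
```

With `transformers` installed, a dictionary like this could be passed as `Seq2SeqTrainingArguments(output_dir="out", **hparams)`, where the `output_dir` value is a placeholder.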

Training results

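| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.453         | 0.28  | 30   | 2.0444          |
| 2.2692        | 0.56  | 60   | 1.8970          |
| 2.1485        | 0.84  | 90   | 1.8373          |
| 2.0469        | 1.12  | 120  | 1.8033          |
| 1.9954        | 1.4   | 150  | 1.7762          |
| 1.9778        | 1.68  | 180  | 1.7593          |
| 1.9536        | 1.96  | 210  | 1.7472          |
| 1.8524        | 2.24  | 240  | 1.7306          |
| 1.8438        | 2.52  | 270  | 1.7255          |
| 1.8436        | 2.8   | 300  | 1.7140          |
| 1.7765        | 3.08  | 330  | 1.7049          |
| 1.7537        | 3.36  | 360  | 1.7057          |
| 1.7328        | 3.64  | 390  | 1.6977          |
| 1.723         | 3.92  | 420  | 1.6973          |
| 1.6592        | 4.2   | 450  | 1.7058          |
| 1.6563        | 4.48  | 480  | 1.7034          |
| 1.6443        | 4.76  | 510  | 1.6969          |
| 1.5782        | 5.04  | 540  | 1.6953          |
| 1.509         | 5.32  | 570  | 1.7136          |
| 1.5516        | 5.6   | 600  | 1.7064          |
| 1.558         | 5.88  | 630  | 1.7045          |
| 1.5016        | 6.16  | 660  | 1.7182          |
| 1.5288        | 6.44  | 690  | 1.7111          |
| 1.4665        | 6.72  | 720  | 1.7030          |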
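The results can be read more easily with a few lines of code. The sketch below copies the logged rows and finds the evaluation step with the lowest validation loss. The variable names are illustrative, and the source does not say whether the published weights come from the final step or from an earlier checkpoint.

```python
# (training_loss, epoch, step, validation_loss), copied from the results above.
rows = [
    (2.453, 0.28, 30, 2.0444), (2.2692, 0.56, 60, 1.8970),
    (2.1485, 0.84, 90, 1.8373), (2.0469, 1.12, 120, 1.8033),
    (1.9954, 1.4, 150, 1.7762), (1.9778, 1.68, 180, 1.7593),
    (1.9536, 1.96, 210, 1.7472), (1.8524, 2.24, 240, 1.7306),
    (1.8438, 2.52, 270, 1.7255), (1.8436, 2.8, 300, 1.7140),
    (1.7765, 3.08, 330, 1.7049), (1.7537, 3.36, 360, 1.7057),
    (1.7328, 3.64, 390, 1.6977), (1.723, 3.92, 420, 1.6973),
    (1.6592, 4.2, 450, 1.7058), (1.6563, 4.48, 480, 1.7034),
    (1.6443, 4.76, 510, 1.6969), (1.5782, 5.04, 540, 1.6953),
    (1.509, 5.32, 570, 1.7136), (1.5516, 5.6, 600, 1.7064),
    (1.558, 5.88, 630, 1.7045), (1.5016, 6.16, 660, 1.7182),
    (1.5288, 6.44, 690, 1.7111), (1.4665, 6.72, 720, 1.7030),
]

# Row with the lowest validation loss.
best_train, best_epoch, best_step, best_val = min(rows, key=lambda r: r[3])

# How far the last logged evaluation sits above the best one.
final_val = rows[-1][3]
gap = round(final_val - best_val, 4)

print(f"best validation loss {best_val} at step {best_step} (epoch {best_epoch})")
print(f"final validation loss {final_val}, {gap} above the best")
```

Validation loss is lowest at step 540 and stays near 1.70 from about epoch 3 onward, while training loss keeps falling. That pattern is consistent with the note above that a higher-quality dataset is the more likely route to improvement than additional epochs.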

Framework versions

  • Transformers 4.36.0.dev0
  • PyTorch 2.1.1+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size: 297M params (Safetensors, tensor type F32)
Model ID: vahn9995/longt5-stable-diffusion-prompt