pszemraj/opt-350m-multiprompt

Generate/augment your prompt with a model trained on a large & diverse prompt dataset.

This model is a fine-tuned version of facebook/opt-350m on the pszemraj/text2image-prompts-multi dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6669
  • eval steps per second: 16.21
  • perplexity: 5.29
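
The perplexity figure follows from the evaluation loss, assuming the loss is the mean per-token cross-entropy in nats (the usual convention for causal language models; the card does not state it explicitly). A minimal sketch of that conversion, with `perplexity_from_loss` as an illustrative helper name:

```python
import math

def perplexity_from_loss(loss: float) -> float:
    """Convert a mean cross-entropy loss (in nats) into perplexity."""
    return math.exp(loss)

# Evaluation loss reported above.
eval_loss = 1.6669
print(f"perplexity ~ {perplexity_from_loss(eval_loss):.4f}")
```

The result is about 5.30, which agrees with the reported 5.29 up to rounding or truncation.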

Example

landscape of florida


The example above was rendered with DALL-E 2, but the generated prompts should work with any text-to-image model.
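
A minimal sketch of how a short seed prompt might be expanded with this model through the `transformers` text-generation pipeline. The helper names (`build_generator`, `augment_prompt`) and the sampling settings (`max_new_tokens`, `top_p`) are assumptions chosen for illustration, not values taken from this card.

```python
def build_generator(model_name="pszemraj/opt-350m-multiprompt"):
    """Load the model as a text-generation pipeline (downloads weights on first use)."""
    from transformers import pipeline  # imported lazily so the helper below stays lightweight
    return pipeline("text-generation", model=model_name)

def augment_prompt(generator, seed_text, max_new_tokens=64):
    """Extend a short seed prompt and return the full generated prompt text."""
    outputs = generator(
        seed_text,
        max_new_tokens=max_new_tokens,  # assumed length budget
        do_sample=True,                 # sample for varied augmentations
        top_p=0.95,                     # assumed nucleus-sampling setting
        num_return_sequences=1,
    )
    # The pipeline returns a list of dicts with a "generated_text" field.
    return outputs[0]["generated_text"].strip()
```

Calling `augment_prompt(build_generator(), "landscape of florida")` would return the seed text followed by the model's continuation. Because sampling is enabled, the output differs between runs.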

Intended uses & limitations

  • The model generates augmentations that are biased towards its training data, i.e. what people have already asked for in the SD/Midjourney Discords and similar sources. Compiling a larger dataset from several different sources was an attempt to mitigate this.

Training and evaluation data

See the pszemraj/text2image-prompts-multi dataset card for details. The dataset is a compilation of several text-to-image prompt datasets on Hugging Face :)

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 256
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.04
  • num_epochs: 4.0
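
The total batch sizes in the list above follow from the per-device values. The sketch below restates the hyperparameters as a plain dictionary whose keys mirror `transformers.TrainingArguments` parameter names (the mapping from the card's labels to these names is an assumption) and recomputes the totals; `effective_batch_sizes` is an illustrative helper, not part of any library.

```python
# Hyperparameters from the list above, keyed by TrainingArguments-style names.
training_kwargs = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.04,
    "num_train_epochs": 4.0,
}
NUM_DEVICES = 2  # num_devices in the list above

def effective_batch_sizes(kwargs, num_devices):
    """Return (total_train_batch_size, total_eval_batch_size)."""
    train_total = (
        kwargs["per_device_train_batch_size"]
        * num_devices
        * kwargs["gradient_accumulation_steps"]
    )
    # Gradient accumulation does not apply during evaluation.
    eval_total = kwargs["per_device_eval_batch_size"] * num_devices
    return train_total, eval_total

print(effective_batch_sizes(training_kwargs, NUM_DEVICES))
```

This gives 8 × 2 × 16 = 256 for training and 4 × 2 = 8 for evaluation, matching the reported totals.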

Training results

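| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.1677        | 1.0   | 990  | 2.0888          |
| 1.856         | 2.0   | 1980 | 1.8215          |
| 1.6864        | 3.0   | 2970 | 1.6935          |
| 1.6228        | 4.0   | 3960 | 1.6670          |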
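
For orientation, the learning-rate curve implied by the cosine scheduler, the 0.04 warmup ratio, and the 3960 total steps can be sketched as below. This is a standalone approximation of a linear-warmup, half-cosine-decay schedule and not the exact code used for training. Rounding the warmup length up with `math.ceil` is an assumption about how the ratio was converted to steps.

```python
import math

BASE_LR = 2e-4        # learning_rate
TOTAL_STEPS = 3960    # final step in the results table
WARMUP_RATIO = 0.04   # lr_scheduler_warmup_ratio
WARMUP_STEPS = math.ceil(TOTAL_STEPS * WARMUP_RATIO)  # assumed rounding

def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

# Learning rate at the end of each epoch listed in the table.
for step in (990, 1980, 2970, 3960):
    print(step, f"{lr_at(step):.2e}")
```

Under these assumptions the rate peaks at 2e-4 once warmup ends and decays to zero by step 3960.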

Framework versions

  • Transformers 4.25.0.dev0
  • Pytorch 1.13.0+cu117
  • Datasets 2.6.1
  • Tokenizers 0.13.1

Model size: 331M params (Safetensors, tensor type BF16)