---
license: bigscience-bloom-rail-1.0
tags:
- generated_from_trainer
- stable-diffusion
- diffusion
model-index:
- name: bloom-560m-finetuned-sd-prompts
  results: []
datasets:
- Gustavosta/Stable-Diffusion-Prompts
widget:
- text: "Prompt: young, curly haired, redhead Natalie Portman as a"
- text: "Prompt: a powerful energy woman, by alexander fedosav"
inference:
  parameters:
    eos_token_id: 2
    max_length: 128
---

# bloom-560m-finetuned-sd-prompts

This model is a fine-tuned version of [bigscience/bloom-560m](https://huggingface.co/bigscience/bloom-560m) on the [Gustavosta/Stable-Diffusion-Prompts](https://huggingface.co/datasets/Gustavosta/Stable-Diffusion-Prompts) dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8742

## Example of usage

```py
import torch
from transformers import BloomTokenizerFast, BloomForCausalLM

device = 'cuda' if torch.cuda.is_available() else 'cpu'
ckpt = 'mrm8488/bloom-560m-finetuned-sd-prompts'

tokenizer = BloomTokenizerFast.from_pretrained(ckpt)
model = BloomForCausalLM.from_pretrained(ckpt).to(device)

def generate_prompt(text):
    inputs = tokenizer(text, return_tensors='pt')
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)
    output = model.generate(input_ids, attention_mask=attention_mask, repetition_penalty=1.05, max_length=2048, eos_token_id=tokenizer.eos_token_id)

    return tokenizer.decode(output[0], skip_special_tokens=False)

text = "Prompt: pikachu dinning in the eiffel tower"

generate_prompt(text)

# Output: Prompt: pikachu dinning in the eiffel tower, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha
```

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.6743        | 0.17  | 100  | 2.0891          |
| 1.8919        | 0.33  | 200  | 1.7191          |
| 1.5907        | 0.5   | 300  | 1.4454          |
| 1.3865        | 0.67  | 400  | 1.3247          |
| 1.2487        | 0.83  | 500  | 1.2150          |
| 1.1565        | 1.0   | 600  | 1.1031          |
| 0.896         | 1.17  | 700  | 1.0612          |
| 0.8389        | 1.33  | 800  | 0.9994          |
| 0.8071        | 1.5   | 900  | 0.9530          |
| 0.7628        | 1.67  | 1000 | 0.9206          |
| 0.7423        | 1.83  | 1100 | 0.8883          |
| 0.7155        | 2.0   | 1200 | 0.8742          |

### Framework versions

- Transformers 4.22.1
- Pytorch 1.12.1+cu113
- Datasets 2.5.1
- Tokenizers 0.12.1
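
### Reading the hyperparameters and results

The numbers listed above can be related to each other with a little arithmetic. The following is a minimal sketch, not the training code behind this model. The helper names (`effective_batch_size`, `linear_lr`, `perplexity`) are hypothetical, and the schedule assumes a single device and zero warmup steps, neither of which the card states explicitly. It also assumes the reported loss is a mean cross-entropy in nats, so that its exponential can be read as a per-token perplexity.

```python
import math

# Values copied from the hyperparameter list and results table in this card.
LEARNING_RATE = 5e-5
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 4
TOTAL_STEPS = 1200        # step count in the last row of the results table
FINAL_VAL_LOSS = 0.8742


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear decay from base_lr to 0; warmup_steps=0 is an assumption."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


def perplexity(loss):
    """Exponential of a mean cross-entropy loss measured in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    print("lr at step 600:", linear_lr(600))
    print("validation perplexity:", round(perplexity(FINAL_VAL_LOSS), 3))
```

Under these assumptions, the effective batch size matches the `total_train_batch_size` of 4 reported in the list, and the learning rate at the end of the first epoch (step 600) is half of its initial value.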