metadata
pipeline_tag: text-to-image
widget:
  - text: a wolf howling at the moon
    output:
      url: wolf.png
  - text: a baseball bat on the beach
    output:
      url: baseball.png
  - text: space
    output:
      url: space.png
  - text: >-
      green dragon, flying, sky, yellow eyes, teeth, wings up, tail, horns,
      solo, clouds,
    output:
      url: dragon.png
  - text: >-
      (impressionistic realism by csybgh), a 50 something male, working in
      banking, very short dyed dark curly balding hair, Afro-Asiatic ancestry,
      talks a lot but listens poorly, stuck in the past, wearing a suit, he has
      a certain charm, bronze skintone, sitting in a bar at night, he is smoking
      and feeling cool, drunk on plum wine, masterpiece, 8k, hyper detailed,
      smokey ambiance, perfect hands AND fingers
    output:
      url: Afro-Asiatic.png
  - text: a cat wearing sunglasses in the summer
    output:
      url: sunglasses.png
  - text: close up portrait of an old woman
    output:
      url: oldwoman.png
  - text: fishing boat, bioluminescent sky
    output:
      url: boat.png
license: apache-2.0

OpenVision (v1): Midjourney Aesthetic for All Your Images

OpenVision is a style enhancement of ProteusV0.4 that seamlessly incorporates the captivating Midjourney aesthetic into every image you generate.

OpenVision excels at the hard-to-describe style Midjourney is renowned for, while still retaining good range and crisp details, especially in portraits.

By baking the Midjourney aesthetic directly into the model, OpenVision eliminates the need for manual adjustments or post-processing.

All synthetic images were generated using the Bittensor Network. Bittensor will decentralise AI, and building SOTA open-source models is key to that. OpenVision is a small step in our grand journey.

Optimal Settings

  • CFG: 1.5 - 2
  • Sampler: Euler Ancestral
  • Steps: 30 - 40
  • Resolution: 1280x1280 (Aesthetic++) or 1024x1024 (Fidelity++)
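
The settings above can be turned into keyword arguments for a diffusers pipeline call. The sketch below is one possible way to do that and is not part of the OpenVision release: `PRESETS`, `CFG_RANGE`, `STEP_RANGE`, and `build_pipeline_kwargs` are hypothetical names, the defaults simply take the low end of each recommended range, and rejecting out-of-range values is an illustrative choice (the ranges are recommendations, not hard limits). The sampler is configured on the pipeline rather than passed per call, so it is left out here.

```python
# Hypothetical helper: maps the recommended settings to pipeline call kwargs.
PRESETS = {
    "aesthetic": (1280, 1280),  # Resolution: 1280x1280 (Aesthetic++)
    "fidelity": (1024, 1024),   # Resolution: 1024x1024 (Fidelity++)
}

CFG_RANGE = (1.5, 2.0)   # CFG: 1.5 - 2
STEP_RANGE = (30, 40)    # Steps: 30 - 40


def build_pipeline_kwargs(preset="aesthetic", cfg=1.5, steps=30):
    """Return call kwargs, rejecting values outside the recommended ranges."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    if not CFG_RANGE[0] <= cfg <= CFG_RANGE[1]:
        raise ValueError(f"cfg {cfg} is outside the recommended range {CFG_RANGE}")
    if not STEP_RANGE[0] <= steps <= STEP_RANGE[1]:
        raise ValueError(f"steps {steps} is outside the recommended range {STEP_RANGE}")

    width, height = PRESETS[preset]
    return {
        "width": width,
        "height": height,
        "guidance_scale": cfg,
        "num_inference_steps": steps,
    }
```

For example, `build_pipeline_kwargs("fidelity", cfg=2.0, steps=40)` gives a 1024x1024 configuration that can be unpacked into a pipeline call with `**`.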

Use it with 🧨 diffusers

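import torch
from diffusers import (
    StableDiffusionXLPipeline,
    EulerAncestralDiscreteScheduler,
    AutoencoderKL
)

# Load VAE component
vae = AutoencoderKL.from_pretrained(
    "madebyollin/sdxl-vae-fp16-fix",
    torch_dtype=torch.float16
)

# Configure the pipeline
pipe = StableDiffusionXLPipeline.from_pretrained(
    "Corcelio/openvision",
    vae=vae,
    torch_dtype=torch.float16
)
# Use the Euler Ancestral sampler recommended under Optimal Settings
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
pipe.to('cuda')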

# Define prompts and generate image
prompt = "a cat wearing sunglasses in the summer"
negative_prompt = ""

image = pipe(
    prompt, 
    negative_prompt=negative_prompt, 
    width=1280,
    height=1280,
    guidance_scale=1.5,
    num_inference_steps=30
).images[0]
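
To compare results across the recommended CFG and step ranges, the call above can be wrapped in a small sweep. This is a minimal sketch, assuming `pipe` is any callable that accepts the same keyword arguments as the call above and returns an object with an `.images` list. `sweep_settings` and its default values are hypothetical and not part of diffusers or OpenVision.

```python
from itertools import product


def sweep_settings(pipe, prompt, cfg_values=(1.5, 2.0), step_values=(30, 40),
                   width=1280, height=1280):
    """Run one generation per (cfg, steps) pair and return {(cfg, steps): image}."""
    results = {}
    for cfg, steps in product(cfg_values, step_values):
        output = pipe(
            prompt,
            negative_prompt="",
            width=width,
            height=height,
            guidance_scale=cfg,
            num_inference_steps=steps,
        )
        # Keep only the first image of each call, as in the example above
        results[(cfg, steps)] = output.images[0]
    return results
```

With the defaults this runs four generations (two CFG values times two step counts), so the ends of both recommended ranges can be compared side by side.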

Credits

Made by Corcel [ https://corcel.io/ ]