The following part of your input was truncated because CLIP can only handle sequences up to 77 tokens
#175 · opened by gebaltso
How can I overcome this? Perhaps by changing the text encoder? If so, is there an example of how to do it? Thanks in advance.
*Edit: using both prompt and prompt_3 (the T5 encoder):
image = pipe(
    prompt=prompt,            # short prompt, encoded by the CLIP text encoders (77-token limit)
    prompt_3=prompt_3,        # long prompt, encoded by the T5 text encoder
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,  # token budget for the T5 encoder
).images[0]
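
For reference, here is a minimal end-to-end sketch of that approach. The checkpoint name, device, and example prompts are assumptions for illustration (I'm using the SD3 medium checkpoint here; swap in whichever SD3 model you are running):

import torch
from diffusers import StableDiffusion3Pipeline

# Assumed checkpoint; any SD3 model with a T5 text encoder works the same way.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

# Keep the CLIP prompt within 77 tokens; put the long description in prompt_3,
# which is routed to the T5 encoder and respects max_sequence_length.
prompt = "A photo of a red fox in a snowy forest"
prompt_3 = (
    "A photo of a red fox in a snowy forest, golden hour light filtering "
    "through the pines, soft bokeh, fine fur detail, cinematic composition"
)

image = pipe(
    prompt=prompt,
    prompt_3=prompt_3,
    negative_prompt="",
    num_inference_steps=28,
    guidance_scale=4.5,
    max_sequence_length=512,
).images[0]
image.save("output.png")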