Replacing models to reduce VRAM

#18
by Deniaud

Greetings! So far, we know that VRAM consumption can be reduced by using an fp8 or GGUF-quantized version of Flux (a rough sketch of the GGUF route follows below).
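
For concreteness, here is a minimal sketch of that GGUF route, assuming the diffusers GGUF loader and one of city96's community quants (the checkpoint URL and quant level are just examples; substitute whichever quant you actually use):

```python
import torch
from diffusers import FluxPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# Example community GGUF quant; swap in your preferred quant level.
ckpt = "https://huggingface.co/city96/FLUX.1-dev-gguf/blob/main/flux1-dev-Q4_K_S.gguf"

# Load only the transformer from the GGUF file, dequantizing on the fly.
transformer = FluxTransformer2DModel.from_single_file(
    ckpt,
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # offload idle submodules to CPU for extra VRAM savings
```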
But I also wanted to ask about the possibility of replacing Qwen 7B with Qwen 2B, and especially with its quantized versions (see the sketch below for what I have in mind). Is this possible, and would it make sense here, or would quality drop to roughly T5 level?
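
Purely for illustration, here is what loading a smaller quantized Qwen as a stand-in encoder might look like. This assumes the encoder in question is from the Qwen2-VL family (adjust to whichever Qwen checkpoint the pipeline actually uses) and a bitsandbytes 4-bit load; note that the 2B model's hidden size (1536) differs from the 7B's (3584), so any projection layers trained against the 7B would presumably need adapting:

```python
import torch
from transformers import AutoProcessor, BitsAndBytesConfig, Qwen2VLForConditionalGeneration

# Hypothetical stand-in: Qwen2-VL-2B in 4-bit via bitsandbytes (requires CUDA + bitsandbytes).
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-VL-2B-Instruct",
    quantization_config=bnb,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2-VL-2B-Instruct")

# Text-only forward pass to pull candidate conditioning features.
inputs = processor(text=["a photo of a cat"], return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)
features = out.hidden_states[-1]
print(features.shape)  # last dim is 1536 for 2B vs. 3584 for 7B, so not a drop-in swap
```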
