General discussion and feedback.

#1 opened by Lewdiculous

@Cran-May - Feel free to perform testing of the requested quants and to share the feedback. Best of luck in your tuning.

Lewdiculous pinned discussion

Which version should I download if I want to use it in LM Studio (or a similar app) on a notebook with these specs: i5-12500H, 64 GB RAM, RTX 3050 with 4 GB VRAM?

Fastest one would be:
https://huggingface.co/Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix/blob/main/firefly-gemma-7b-IQ2_XXS-imatrix.gguf

But this one should offer a better balance of quality at the cost of some speed:
https://huggingface.co/Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix/blob/main/firefly-gemma-7b-IQ4_XS-imatrix.gguf

I recommend a Q5 if quality is very important:
https://huggingface.co/Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix/blob/main/firefly-gemma-7b-Q5_K_S-imatrix.gguf
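
If you'd rather script it yourself instead of (or alongside) LM Studio, here's a minimal sketch using llama-cpp-python and huggingface_hub. The n_gpu_layers value is just a placeholder to tune for a 4 GB card, and you'd need a CUDA-enabled build of llama-cpp-python for the offload to do anything:

```python
# Minimal sketch: pull the IQ4_XS quant from this repo and run it locally.
# Assumes: pip install huggingface_hub llama-cpp-python (CUDA build for GPU offload).
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the GGUF file (cached locally by huggingface_hub after the first run).
model_path = hf_hub_download(
    repo_id="Lewdiculous/firefly-gemma-7b-GGUF-IQ-Imatrix",
    filename="firefly-gemma-7b-IQ4_XS-imatrix.gguf",
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,       # context window
    n_gpu_layers=12,  # rough guess for 4 GB VRAM; raise/lower until it stops running out of memory
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize what a GGUF quant is in two sentences."}]
)
print(out["choices"][0]["message"]["content"])
```

At the IQ4_XS size and above, a 7B model generally won't fit entirely in 4 GB of VRAM, so partial offload plus your 64 GB of system RAM is the realistic setup; the same idea applies to the GPU offload slider in LM Studio.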

I'm still waiting on @Cran-May's testing to make sure everything is okay; if you can test and provide feedback, that would also be useful.

My first impression after trying the Q6_K and IQ4_XS versions is that they work OK but didn't blow my mind. You definitely have better models in your collection. I'll keep testing with other prompts and see if those work better.

> You definitely have better models in your collection.

Ah, yeah so...

> firefly-gemma-7b is trained based on gemma-7b to act as a helpful and harmless AI assistant.

I actually didn't even add this one to the Collection, haha, since it's kind of not like the rest.

This model isn't quite like the others; for example, Eris and InfinityRP benchmark much higher. This was a request for a general-use assistant model based on the Google Gemma architecture, not really meant for what we usually do. It's more of a model you'd deploy locally as a personal assistant, so it's not all that useful for me personally, haha.

Generally speaking, keep an eye on the Favorites collection:

https://huggingface.co/collections/Lewdiculous/personal-favorites-65dcbe240e6ad245510519aa

That's where I'll group the more outstanding models. There's a lot of experimentation going on at the moment, but these:

  1. https://huggingface.co/Lewdiculous/InfinityRP-v1-7B-GGUF-IQ-Imatrix

  2. https://huggingface.co/Lewdiculous/Persephone_7B-GGUF-IQ-Imatrix

  3. https://huggingface.co/Lewdiculous/Eris-Lelanacles-7b-GGUF-IQ-Imatrix

  4. https://huggingface.co/Lewdiculous/Infinitely-Laydiculous-7B-GGUF-IQ-Imatrix

Should bench/perform a lot better, and are more in line with my own use case.

Cheers, and thanks for the feedback. I'm just glad the model isn't broken, as I'm not used to making quants for Gemma-based models.

Oh yeah, that explains it. I just assumed it was supposed to be an RP model too, because firefly is such a cute name ;)

@WesPro - Oh, you can be sure it would be a firefly anime girl in that case.

:'3
