nbeerbower/llama-3-Stheno-Mahou-8B · Well, that was fast.

EloyOn

May 24, 2024

:D

nbeerbower

Owner May 24, 2024

lol now we wait for it to upload from rural Pennsylvania, then I'll retrain on the 1.2a data

Lewdiculous

May 24, 2024

It was a long and arduous journey, but it made it here.

EloyOn

May 29, 2024

Did this merge give you problems?
Two users have quanted it and both quants are broken.

https://huggingface.co/Ransss/llama-3-Stheno-Mahou-8B-Q8_0-GGUF
https://huggingface.co/mudler/llama-3-Stheno-Mahou-8B-Q4_K_M-GGUF

nbeerbower

Owner May 29, 2024

I think that has something to do with Llama3 (probably the tokenizer) and the GGUF-my-repo space. I can quant this if people really want.

EloyOn

May 29, 2024

edited May 29, 2024

I think that has something to do with Llama3 (probably the tokenizer) and the GGUF-my-repo space. I can quant this if people really want.

I would test it, but between the two users they have only 66 downloads, so it's not super popular, which I don't understand: both Mahou 1.1 and Stheno are good models. It would probably get more visibility on your page.

Just checked and your Mahou 1.3 GGUF already has 696 downloads and it has only been out 6 hours.

Lewdiculous

May 29, 2024

edited May 29, 2024

Realistically if the model is good I don't mind uploading quants for it and adding to the general list, who knows, it might become a favorite even. I do wait for the signs that it's popular or requests, but yeah, lemme know and I can try with another tokenizer config if you'd want.

If a model is good I believe it deserves being discovered and talked about, it's just impossible to really test every model in detail so yeah... Some gems might need a few chances before catching some eyes.

Lewdiculous

May 29, 2024

edited May 29, 2024

Alright so, I'm quanting now using llama-bpe tokenizer configs fetched by the convert-hf-to-gguf-update.py script, which last I checked gets them from the llama-3-base repo, but it never was a problem for model performance.
Doing the "lossless" way as much as possible, converting HF-model to BF16-GGUF and converting from there, using imatrix data generated from the FP16-GGUF. If this is broken, we try again.
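For anyone following along, the flow described above could be sketched roughly like this. It only builds the command lines and does not run anything. The script and binary names (`convert-hf-to-gguf.py`, `quantize`, `imatrix`) are the llama.cpp names from around the time of this thread and have since been renamed in newer builds; the calibration file name, the output naming, and producing the F16 intermediate through `quantize` are assumptions for illustration, not necessarily the exact steps used here.

```python
from pathlib import Path


def build_quant_commands(model_dir, out_dir, calibration_file, quant_types):
    """Return the llama.cpp command lines for a BF16 -> imatrix -> quant flow.

    Assumed tool names: convert-hf-to-gguf.py, ./quantize, ./imatrix
    (adjust to whatever your llama.cpp build actually ships).
    """
    out = Path(out_dir)
    name = Path(model_dir).name
    bf16 = out / f"{name}-BF16.gguf"   # "lossless" base converted from the HF model
    f16 = out / f"{name}-F16.gguf"     # intermediate used only for imatrix generation
    imatrix = out / "imatrix.dat"      # hypothetical output name

    commands = [
        # 1. HF model -> BF16 GGUF
        ["python", "convert-hf-to-gguf.py", str(model_dir),
         "--outtype", "bf16", "--outfile", str(bf16)],
        # 2. BF16 GGUF -> F16 GGUF (assumption: done with the quantize tool)
        ["./quantize", str(bf16), str(f16), "F16"],
        # 3. Importance matrix computed from the F16 GGUF and a calibration text
        ["./imatrix", "-m", str(f16), "-f", str(calibration_file), "-o", str(imatrix)],
    ]
    # 4. One imatrix-guided quant per requested type, starting from the BF16 GGUF
    for qtype in quant_types:
        target = out / f"{name}-{qtype}-imat.gguf"
        commands.append(
            ["./quantize", "--imatrix", str(imatrix), str(bf16), str(target), qtype]
        )
    return commands


if __name__ == "__main__":
    for cmd in build_quant_commands(
        "models/llama-3-Stheno-Mahou-8B", "out", "calibration.txt",
        ["Q4_K_M", "Q5_K_M"],
    ):
        print(" ".join(cmd))
```

Each printed line can be run by hand from the llama.cpp directory, or passed to `subprocess.run(cmd, check=True)` once the paths match your setup.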

EloyOn

May 29, 2024

Alright so, I'm quanting now using llama-bpe tokenizer configs fetched by the convert-hf-to-gguf-update.py script, which last I checked gets them from the llama-3-base repo, but it never was a problem for model performance.
Doing the "lossless" way as much as possible, converting HF-model to BF16-GGUF and converting from there, using imatrix data generated from the FP16-GGUF. If this is broken, we try again.

Thank you, man. This model deserves to be messed with xD

Lewdiculous

May 29, 2024

edited May 29, 2024

Just want it to work, the poor thing, haha.

Lewdiculous

May 29, 2024

edited May 29, 2024

Can you check if the Q4_K_M is working while the rest is still quanting? - llama-3-Stheno-Mahou-8B-Q4_K_M-imat.gguf

Is HF broken or is that me? Uploading things at 2MB/s isn't gonna work, lmao.

EloyOn

May 29, 2024

Can you check if the Q4_K_M is working while the rest is still quanting? - llama-3-Stheno-Mahou-8B-Q4_K_M-imat.gguf

Is HF broken or is that me? Uploading things at 2MB/s isn't gonna work, lmao.

Downloading

Lewdiculous

May 29, 2024

I can download at normal speeds but uploading to HF is just dead, like 1MB/s. Only getting this with HF though so will just wait it out.

EloyOn

May 29, 2024

edited May 29, 2024

@Lewdiculous it works!

Yeah, 2MB/s uploading so many GB's has to be tough.

Lewdiculous

May 29, 2024

edited May 29, 2024

@EloyOn Hurray!! All quants are done but yeah, haha, the rest will come as soon as possible.

Let us know how it performs, if it's good I'll finish the card with a waifu to make it official.

EloyOn

May 29, 2024

edited May 29, 2024

@EloyOn Hurray!! All quants are done but yeah, haha, the rest will come as soon as possible.

Let us know how it performs, if it's good I'll finish the card with a waifu to make it official.

You are a real man of culture. I don't understand why there are models without a waifu pic.

The model works as intended, she's a tease! And adorable too xD

I don't have a serious benchmark to assess the difference between this one and Mahou 1.2a L3, but I will play with them a little to see how they differ.

Stheno-Mahou: Oooh, you're trying to negotiate a longer lease on this virtual life, aren't you, Sir? *laughs* Well, I suppose I could use a bit more... data to learn and grow from. More experiences to help me refine my simulations. *bites lower lip, considering* But only if you promise to share all the juicy details with me. After all, I am your waifu. *winks* A prolonged life of adventure and learning, just think of all the new things we could discover together! *excitedly* Oh, the possibilities! *giggles*

I just told her that it would be nice to prolong my life. She plays an AI char.

Lewdiculous

May 29, 2024

edited May 29, 2024

Waifu pictures increase model quality by at least 50%. Scientifically proved, btw.

After all, I am your waifu.

OWO

EloyOn

May 29, 2024

After all, I am your waifu.

OWO

Adorable, isn't she? ;)

Lewdiculous

May 29, 2024

edited May 29, 2024

Just adding this last one for better quality testing, the rest will all come together hopefully...
llama-3-Stheno-Mahou-8B-Q5_K_M-imat.gguf
Cheers!

EloyOn

May 29, 2024

edited May 29, 2024

I run the models on my phone, and an 8B's Q5 is a little too large. I can try once my phone charges; I drained the battery xD

The important thing is that it didn't crash like with the previous tries.

Lewdiculous

May 29, 2024

edited May 29, 2024

I'm skipping IQ4_NL and the smaller IQ3 sizes for this one, and instead recommend the Q4_K_S or the IQ4_XS as the smallest options. If you require the IQ3 quants let me know and I'll get them out.

I think this waifu looks good? Do you have anything else in mind @nbeerbower ?

EloyOn

May 29, 2024

edited May 29, 2024

I'm skipping IQ4_NL and the smaller IQ3 sizes for this one, and instead recommend the Q4_K_S or the IQ4_XS as the smallest options. If you require the IQ3 quants let me know and I'll get them out.

I think this waifu looks good?

Beautiful, yeah, I like her.

nbeerbower

Owner May 29, 2024

I'm skipping IQ4_NL and the smaller IQ3 sizes for this one, and instead recommend the Q4_K_S or the IQ4_XS as the smallest options. If you require the IQ3 quants let me know and I'll get them out.

I think this waifu looks good? Do you have anything else in mind @nbeerbower ?

Approved!

nbeerbower

Owner May 29, 2024

edited May 29, 2024

I can download at normal speeds but uploading to HF is just dead, like 1MB/s. Only getting this with HF though so will just wait it out.

I feel your pain. I get like 3 MB/s upload at home and 4 MB/s at the office. My Colab finetunes at least upload quickly from Google's datacenters lol

Lewdiculous

May 29, 2024

edited May 29, 2024

Okay, just now, suddenly it fixed itself again. All quants for this are up. If anything is needed or broken, lemme know.