Kunoichi with long legs!

#1
by Spacellary - opened

Keep up the good work! I feel like it handles longer contexts better than the original. Not sure what I'm sacrificing for it yet, but it's a positive first impression.

Looking forward to it, especially if you're cooking with Kunoichi again.

Honestly, I need to graph out perplexity between the original and the extended version. But thank you for the feedback! It was a bit weird to get these models to merge at all, but I think it can be improved further.
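For anyone wanting to run the same comparison, here's a minimal sliding-window perplexity sketch using transformers; the model ids and the evaluation text are placeholders, and the non-overlapping windows are a simplification of a proper strided evaluation:

```python
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model_id: str, text: str, window: int = 4096) -> float:
    """Mean perplexity over non-overlapping windows of `text`."""
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    ids = tok(text, return_tensors="pt").input_ids.to(model.device)
    total_nll, n_tokens = 0.0, 0
    for start in range(0, ids.size(1), window):
        chunk = ids[:, start : start + window]
        if chunk.size(1) < 2:  # need at least one shifted target token
            break
        with torch.no_grad():
            # labels == input_ids makes the model return mean cross-entropy
            out = model(chunk, labels=chunk)
        total_nll += out.loss.item() * chunk.size(1)
        n_tokens += chunk.size(1)
    return math.exp(total_nll / n_tokens)

# Placeholder ids -- substitute the original model and the extended merge.
text = open("long_eval_text.txt").read()
print("original:", perplexity("org/original-7b", text))
print("extended:", perplexity("org/extended-7b-128k", text))
```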

@Test157t Absolutely, that'd be nice.

@Test157t - Honestly, there's something about this model that just jibes with my character cards so well. It's a special good boy. A couple of other people I recommended it to were also positive on it. It just feels "consistent".

At this rate I might have to do a v2 for this one.

@Test157t -- Booyah! Not sure it's worth testing so late, but I'm uploading the GGUF-Imatrix quants from the F16 for this model here, including the new IQ3_S that should be compatible with the next version of KCPP.
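For anyone curious, this is the rough shape of that workflow, sketched as a tiny Python wrapper around the llama.cpp tools; the binary locations, filenames, and calibration text are assumptions about a local setup, not the exact commands used here:

```python
import subprocess

F16 = "Kunocchini-7b-128k-test.F16.gguf"  # assumed source filename
IMATRIX = "imatrix.dat"

# 1. Build the importance matrix from a calibration text
#    (llama.cpp's `imatrix` tool).
subprocess.run(
    ["./imatrix", "-m", F16, "-f", "calibration.txt", "-o", IMATRIX],
    check=True,
)

# 2. Quantize to each target type, passing the imatrix to `quantize`.
for qtype in ["IQ3_S", "Q4_K_M", "Q5_K_M"]:
    out = F16.replace("F16", qtype)
    subprocess.run(
        ["./quantize", "--imatrix", IMATRIX, F16, out, qtype],
        check=True,
    )
```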

@Test157t -- Already uploading v2 quants from the new config to the same repo. If no major issues are found, then this model should be good on quants; the new ones are tagged with v2 in the file names for easy distinction. If needed I'll do a v3 or whatever as necessary. Love me some longer context :')

@Lewdiculous Thank you! The long-context version (1.2) is still "regarded", so I'm going to have to go back to the drawing board with that one. Been testing Lelantacles v5 for comparison and it kicks ass at 16k still.

> Been testing Lelantacles v5 for comparison and it kicks ass at 16k still

In a way it was progress :)

Will hold off until 1.2, or more realistically the next version entirely, is cleared for testing then. It's all about experimenting, cheers!

Fun times. I look outside and there's already IQ4_NL, and now a new IQ4_XS merged 11 hours ago. GGUF is moving fast xD

Might redo 1.2 without normalizing, and with SLERP instead of DARE-TIES. But I will test it locally along with the quants before I clear it for use.
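For reference, a minimal sketch of what that SLERP recipe might look like as a mergekit config (as far as I know, SLERP doesn't take a normalize parameter, so that knob goes away with the method switch anyway); the model ids, layer range, and t value here are placeholders, not the actual recipe:

```python
# Sketch of a mergekit SLERP config; run with:
#   mergekit-yaml slerp-config.yml ./merged-model
config = """\
merge_method: slerp
base_model: SanjiWatsuki/Kunoichi-DPO-v2-7B
slices:
  - sources:
      - model: SanjiWatsuki/Kunoichi-DPO-v2-7B
        layer_range: [0, 32]
      - model: some-org/long-context-7b   # hypothetical second model
        layer_range: [0, 32]
parameters:
  t: 0.5   # 0 = all base model, 1 = all second model
dtype: float16
"""
with open("slerp-config.yml", "w") as f:
    f.write(config)
```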

Oh, it would appear that I've missed that; I'm falling behind, it would seem.

@Test157t, yeah, things are moving quite fast in GGUF land now.

IQ4_NL: https://github.com/ggerganov/llama.cpp/pull/5590
IQ4_XS: https://github.com/ggerganov/llama.cpp/pull/5747

Honestly, with the very small loss in quality, it's much more interesting for me to just push for higher context: I'm staying around the same VRAM usage, but with 4K more context.
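Back-of-the-envelope numbers for that trade-off, assuming a Mistral-7B-style model (32 layers, 8 KV heads via GQA, head dim 128) and an fp16 KV cache; a minimal sketch:

```python
def kv_cache_bytes(n_ctx: int, n_layers: int = 32, n_kv_heads: int = 8,
                   head_dim: int = 128, bytes_per_elem: int = 2) -> int:
    """K and V tensors: 2 * layers * ctx * kv_heads * head_dim * elem size."""
    return 2 * n_layers * n_ctx * n_kv_heads * head_dim * bytes_per_elem

for ctx in (8192, 12288):
    print(f"{ctx:>6} ctx -> {kv_cache_bytes(ctx) / 2**30:.2f} GiB KV cache")
# 8192 -> 1.00 GiB, 12288 -> 1.50 GiB: +4K context is ~0.5 GiB,
# roughly what stepping the weights down one quant size frees up.
```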

Need to see how it goes; maybe you can do real benchmarks.

I've uploaded these two new quant types for Kunocchini-7b-128k-test-GGUF-Imatrix; they are in the v2 list.
