KoboldCpp version 1.64?

#3
by SolidSnacke - opened

Are you from the future?

Not even talking about the fork, just saying that version 1.64 will have the fix merged, haha. So it's safe to recommend 1.64 already, for when it's out.

You can try this fork if they already have all the fixes merged, but I personally stick to the official releases.

Lewdiculous changed discussion status to closed


I don't believe the fork has them; its latest release is from 2 days prior to the BPE fix. I personally use the fork and it's not any faster for me, just slightly lower VRAM usage. There's no point in switching unless you desperately need an extra 0.2 GB of VRAM.

We shall wait then.


Nexesenex is so quick 😭

https://github.com/Nexesenex/kobold.cpp/releases

The latest exe supports BPE tokenization and Flash Attention.
Don't know which version of FA though; my old 20-series is useless for FA2.
Support for Turing GPUs has been "coming soon" since July 2023 🥲

Update on Flash Attention: it works on Turing, and it saves a fxck ton of VRAM even on 8 GB!
With FA: [screenshot: Screenshot 2024-05-01 235353.png]
Without FA: [screenshot: Screenshot 2024-05-01 235536.png]
Both Q4_K_M.
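
For anyone wanting to try the same comparison, here's a minimal sketch of launching KoboldCpp with Flash Attention on. The --flashattention flag is the one shipped in the 1.64-era builds; the model path and --gpulayers count are placeholders for your own setup, not recommendations from this thread:

```python
# Minimal sketch: launch KoboldCpp with Flash Attention enabled.
# The model filename and layer count below are placeholders.
import subprocess

subprocess.run([
    "python", "koboldcpp.py",
    "--model", "Llama-3-8B-Instruct.Q4_K_M.gguf",  # placeholder path
    "--contextsize", "8192",
    "--flashattention",   # the VRAM saver compared above
    "--gpulayers", "33",  # placeholder; tune for an 8 GB card
])
```

Drop --flashattention to reproduce the "without FA" case.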

Q4_K_S makes 24k Llama 3 context possible :3 (napkin math below)
Or makes room for a GGUF-converted SD checkpoint
Or a vision model
But SOVL explodes at 24k
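
Rough napkin math for that 24k figure: the KV cache alone runs about 3 GiB. A minimal sketch, assuming Llama-3-8B's published config (32 layers, 8 KV heads via GQA, head dim 128) and an fp16 cache; these numbers are mine, not from the thread:

```python
# Back-of-the-envelope KV-cache sizing for Llama-3-8B at 24k context.
# Assumptions (not from the thread): 32 layers, 8 KV heads (GQA),
# head dim 128, fp16 K/V entries -- matches the published 8B config.
n_layers   = 32
n_kv_heads = 8
head_dim   = 128
n_ctx      = 24_576   # "24k" context
bytes_per  = 2        # fp16

# Factor of 2 covers both the K and the V tensors per layer.
kv_bytes = 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per
print(f"KV cache: {kv_bytes / 2**30:.2f} GiB")  # -> 3.00 GiB
```

So shaving the weights from Q4_K_M down to Q4_K_S frees roughly the headroom that the extra context (or an SD checkpoint / vision model) eats.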



The wait for the official repo is over!
As of 12 minutes ago :3
https://github.com/LostRuins/koboldcpp/releases/tag/v1.64

nice!
