gguf?

#3
by LaferriereJC - opened

@TheBloke
Can you GGUF'ize these?

I'm waiting for it too, it's the best model I've come across so far.

The Bloke just released it.

Edit: I'm getting the newline character typed out literally, such as <0x0A><0x0A>, rather than applied as new paragraphs. Is this just the case for the GGUF version, the GPT4All app I'm using... or does it also happen with the unquantized version?

Argilla org

Hi @LaferriereJC @YAKOVNUKJHJ , good news ✨ The awesome @TheBloke has already quantized this model (announced recently at https://twitter.com/alvarobartt/status/1731587062522929520), so you should already be able to use it in either GGUF or AWQ format.
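
For illustration, here is a minimal sketch of prompting one of those GGUF quants with llama-cpp-python; the local filename below is an assumption based on TheBloke's usual naming, so double-check it against the files on the Hub.

```python
# Minimal sketch using llama-cpp-python; assumes the GGUF file has
# already been downloaded from TheBloke's Hub repo (filename is an
# assumption based on his usual naming scheme).
from llama_cpp import Llama

llm = Llama(
    model_path="./notus-7b-v1.Q4_K_M.gguf",  # assumed local filename
    n_ctx=4096,                              # context window size
)

# Notus is a chat model, so the chat-completion API is the simplest
# way to prompt it with the right template applied.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write two short paragraphs."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```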

Argilla org

> Edit: I'm getting the newline character typed out literally, such as <0x0A><0x0A>, rather than applied as new paragraphs. Is this just the case for the GGUF version, the GPT4All app I'm using... or does it also happen with the unquantized version?

Hi @Phil337, could you elaborate a bit on this? Is it related to the GGUF quantized weights, or to prompting the Notus model itself?

@alvarobartt It must be due to the GGUF version (I'm using Q4_0) or how it interacts with GPT4All, because it happens on EVERY newline and paragraph regardless of prompt. There's a known token issue with GPT4All and the latest GGUF implementation that, according to the GitHub page, will be fixed in the next update, so maybe that's it. Other than this, Notus performed very well.

Argilla org

Happy to hear that, @Phil337. We'll also play around a bit with the quantized versions this week!

alvarobartt changed discussion status to closed
Argilla org

Hi again @Phil337, after reading a bit more it seems the issue with the <0x0A> tokens arose because the tokenizer.model file (the SentencePiece-based slow tokenizer) was missing, so the GGUF quantization had to rebuild the tokenizer from the existing vocab file, which led to some errors. I saw the same thing reported at https://huggingface.co/TheBloke/Starling-LM-7B-alpha-GGUF/discussions/1, and finally decided to port the file from https://huggingface.co/HuggingFaceH4/zephyr-7b-beta/blob/main/tokenizer.model, as we're using the same tokenizer. Thanks also to @plaguss for reporting it internally!
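
A quick way to sanity-check the fix is to round-trip some newlines through the slow tokenizer; a minimal sketch, assuming the repo id is argilla/notus-7b-v1:

```python
from transformers import AutoTokenizer

# use_fast=False loads the SentencePiece-based slow tokenizer, which
# requires the tokenizer.model file that was previously missing.
tok = AutoTokenizer.from_pretrained("argilla/notus-7b-v1", use_fast=False)

ids = tok.encode("first paragraph\n\nsecond paragraph", add_special_tokens=False)
decoded = tok.decode(ids)

# With a correct tokenizer.model, the round trip preserves real newlines
# instead of emitting literal "<0x0A>" byte tokens.
assert "<0x0A>" not in decoded
print(decoded)
```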

@alvarobartt Thanks for looking into it and finding the cause.
