GGUF in llama.cpp

by Bearsaerker

Would this also work quantized for long context in llama.cpp, or are there any special dependencies specific to the implementation in the model card?

Hi, I haven't used llama.cpp before. There are no special dependencies for this implementation beyond pytorch==2.1.2, transformers==4.36.1, and accelerate==0.25.0.
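
For reference, those pins as a single install command (note that the PyTorch package is published on PyPI as `torch`, not `pytorch`):

    pip install torch==2.1.2 transformers==4.36.1 accelerate==0.25.0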

I get this error when trying to convert to GGUF:

    raise Exception(f"Unexpected tensor name: {name}")
Exception: Unexpected tensor name: model.beacon_embed_tokens.weight
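
From the error itself, the conversion script raises on any tensor name it can't map to a GGUF name, and `model.beacon_embed_tokens.weight` comes from this model's beacon (long-context) modification rather than the base Llama layout, so the stock converter doesn't know what to do with it. A minimal sketch to list which tensors fall outside the standard naming, assuming the weights are stored as `model.safetensors` in the checkpoint directory (the filename is an assumption, not from this thread):

    # List tensor names in the checkpoint and flag the beacon-specific ones
    # that the GGUF converter doesn't recognize.
    from safetensors import safe_open

    with safe_open("model.safetensors", framework="pt") as f:
        for name in f.keys():
            if "beacon" in name:
                print(name)  # e.g. model.beacon_embed_tokens.weight

As far as I can tell, simply skipping those tensors during conversion wouldn't help: the resulting GGUF would be missing the weights the beacon mechanism needs, so llama.cpp would also need architecture-level support to run the model correctly.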

Does anyone know how we can use this model quantized?
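Not a llama.cpp answer, but until the architecture is supported there, one workaround might be load-time 4-bit quantization through transformers and bitsandbytes. A hedged sketch, not from the model card: the repo ID below is a placeholder, and `trust_remote_code=True` is an assumption based on the custom beacon implementation.

    # Sketch: quantize at load time with bitsandbytes instead of GGUF.
    # "namespace/model" is a placeholder for this repo's actual ID.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # NF4 4-bit weights
        bnb_4bit_compute_dtype=torch.float16,  # compute in fp16
    )
    model = AutoModelForCausalLM.from_pretrained(
        "namespace/model",
        quantization_config=bnb_config,
        trust_remote_code=True,  # assumed: beacon code ships with the repo
        device_map="auto",       # uses accelerate, which is already a dependency
    )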
