GGUF versions

#1
opened by christianweyer

Hey Ronan,

When will we see .gguf versions to use with llama.cpp?
Thx!
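
(For anyone planning ahead: once a GGUF lands, it could be loaded with llama-cpp-python roughly like this. A minimal sketch; the file name is hypothetical, since no GGUF has been published yet.)

```python
# Minimal sketch: loading a (hypothetical) GGUF of this model with
# llama-cpp-python (pip install llama-cpp-python).
from llama_cpp import Llama

llm = Llama(
    model_path="phi-3-mini-function-calling.Q4_K_M.gguf",  # hypothetical file name
    n_ctx=4096,  # context window to allocate
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What's the weather in Dublin?"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```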

Trelis org

Howdy, I'll aim to get it up this week.

Perfect, thank you!

Also very, very interested in the GGUF version :) Thanks a lot, I definitely need this function-calling version.

Trelis org

Ok, the issue here is with the LongRoPE scaling that Microsoft is using. It's causing issues with TGI and with GGUFs. I'm tracking this issue: https://github.com/ggerganov/llama.cpp/issues/6849
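
For reference, the LongRoPE config that the converters choke on is visible right in the model config. A minimal sketch with transformers; field names are as in Microsoft's Phi-3 config at the time:

```python
# Inspect the RoPE scaling config that llama.cpp's GGUF converter
# (and TGI) couldn't handle yet -- see ggerganov/llama.cpp#6849.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained(
    "microsoft/Phi-3-mini-128k-instruct", trust_remote_code=True
)
print(cfg.rope_scaling["type"])              # "longrope" (earlier configs said "su")
print(cfg.original_max_position_embeddings)  # 4096: the pre-scaling window
print(cfg.max_position_embeddings)           # 131072: the window after scaling
```

That 4096-token pre-scaling window is why a 4k release sidesteps the converter problem.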

In the meantime, I plan to release a 4k model. Is that useful, or is the 128k key?

For me, 4k is already useful; 128k would be 20% of my usage.

4k is also useful, yes.

Trelis org

Noted, I'll aim to get on this late next week. I'm travelling, sorry for the delay.

The GGUF (4k or 128k) would be very helpful. ❤️

I'm currently running Gorilla OpenFunctions v2. How will Phi-3 function calling compare? Gorilla launched a leaderboard to compare function-calling models. Does anyone have insights on the relevance of that leaderboard? https://gorilla.cs.berkeley.edu/leaderboard.html

Trelis org

This is taking a long time to get resolved in llama.cpp, which is blocking the GGUF.

Would an MLX quant be useful instead (like this)?

Or is GGUF really needed because that's what's supported by libraries/apps like LM Studio?
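
(If MLX works for you, a quant would be used roughly like this with mlx-lm on Apple Silicon. A minimal sketch; the repo id is hypothetical.)

```python
# Minimal sketch: running an MLX quant with mlx-lm (pip install mlx-lm).
from mlx_lm import load, generate

# Hypothetical repo id -- no MLX quant has been confirmed yet.
model, tokenizer = load("Trelis/Phi-3-mini-instruct-function-calling-4bit-mlx")

print(generate(model, tokenizer,
               prompt="Call get_weather for Dublin.",
               max_tokens=64))
```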

I am using Ollama with LiteLLM, so the GGUF would be great.
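
(For that setup, the call would look roughly like this once the GGUF is pulled into Ollama. A minimal sketch; the Ollama model tag is hypothetical.)

```python
# Minimal sketch: calling a (hypothetical) Ollama model through LiteLLM
# (pip install litellm), assuming the GGUF has been pulled into Ollama.
import litellm

resp = litellm.completion(
    model="ollama/phi3-function-calling",   # hypothetical Ollama model tag
    messages=[{"role": "user", "content": "What's the weather in Dublin?"}],
    api_base="http://localhost:11434",      # Ollama's default local endpoint
)
print(resp.choices[0].message.content)
```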

Has there been any update on this, @RonanMcGovern? :-)

Trelis org

Howdy, I don't think the issue blocking 128k GGUFs got resolved, but I've asked about Phi-3.5, where it seems possible. If I get confirmation, I can look into training Phi-3.5 for function calling.

https://huggingface.co/bartowski/Phi-3.5-mini-instruct-GGUF/discussions/3

Best, Ronan

Way to go @RonanMcGovern - thx!
