Convert model to GGUF or a format compatible with LM Studio

#66
by anthropoleo - opened

Hi,

Does anyone know how to convert a fine-tuned version of Bert-base-uncased for text classification into a format that allows me to load it using LM Studio or Ollama?

After training the model I pushed it to the Hub, but now I would like to use it in a way that's friendly to its users, not just piping the model through a notebook.

If it helps, these are the files that were created when I pushed the model:

[Screenshot of the model repository file list: Screenshot 2024-04-05 at 08.26.36.png]

Hi @anthropoleo
BERT is a relatively small model and is not auto-regressive; in most cases a simple Python backend such as transformers suffices, even for running the model locally on CPU.
To convert to GGUF, I would advise you to open an issue on the ggml / llama.cpp repositories on GitHub and see if the maintainers are keen to add BERT support!
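
For reference, here is a minimal sketch of that transformers approach for a fine-tuned text classifier. The repo id below is a hypothetical placeholder for your own model on the Hub:

```python
# Minimal CPU inference with transformers -- no GGUF conversion needed.
from transformers import pipeline

# "anthropoleo/bert-base-uncased-finetuned" is a hypothetical repo id;
# replace it with the path of your own fine-tuned model on the Hub.
classifier = pipeline(
    "text-classification",
    model="anthropoleo/bert-base-uncased-finetuned",
    device=-1,  # -1 = run on CPU
)

print(classifier("This movie was surprisingly good!"))
# e.g. [{'label': 'POSITIVE', 'score': 0.98}]
```

You can then wrap this in a small web service or Gradio app if you want something user-friendly instead of a notebook.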

Hey, did you find any way to do it?

Hi @anthropoleo, I'm in the same predicament as you; did you find a solution? I'd be grateful if you could share it.

I would also like to find a similar solution. It would be great to be able to use BERT on any UI that runs llama.cpp

I found this thread on the llama.cpp issue tracker:

https://github.com/ggerganov/llama.cpp/issues/7924

It seems no one has succeeded in converting BERT to GGUF yet; there is a lack of interest from experienced quantizers.

The transformers library also does not yet support loading BERT from GGUF anyway, due to the same tensor-mapping issue:

https://github.com/huggingface/transformers/issues/34238
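
For context, this is roughly what the GGUF loading path in transformers looks like for architectures it does support (e.g. Llama); the repo id and filename below are placeholders, and BERT fails at the tensor-name mapping step this relies on:

```python
# transformers de-quantizes GGUF weights and maps GGUF tensor names back
# to transformers names -- the mapping that is missing for BERT.
# Repo id and filename are placeholders for a supported GGUF model.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "TheBloke/TinyLlama-1.1B-Chat-v1.0-GGUF"
gguf_file = "tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf"

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
```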
