
Will there be quantized versions? (GGUF)

#4
by alexcardo - opened

Can this model be quantized and converted to the GGUF format to use it with llama.cpp?

Did you check Visual Studio for an AI extension to convert it?

Allen Institute for AI org

@TheBlocki plz thx
We'll work on more code integrations; let us know if anything specific is wrong.

I have been trying to hack it to work this morning. I added a new arch, "OlmoModelForCausalLM", but I'm not sure whether there is an existing compatible one like MODEL_ARCH.LLAMA.
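For anyone trying the same thing, here is a minimal sketch of the general shape this takes in llama.cpp's gguf-py package: an arch enum entry, a header name, and a per-arch table of allowed GGUF tensor names. The OLMO entry and the tensor names below are illustrative assumptions, not the actual llama.cpp source:

```python
# Illustrative sketch only -- not the actual llama.cpp source.
# gguf-py keeps a model-arch enum plus per-arch tensor-name tables;
# a new arch needs entries in both before the converter can map tensors.
from enum import IntEnum, auto

class MODEL_ARCH(IntEnum):
    LLAMA = auto()
    OLMO = auto()   # hypothetical new entry

# Human-readable arch name written into the GGUF header.
MODEL_ARCH_NAMES = {
    MODEL_ARCH.LLAMA: "llama",
    MODEL_ARCH.OLMO: "olmo",
}

# Per-arch list of GGUF tensor names the converter is allowed to emit
# (a guess at the subset OLMo would need, in llama.cpp naming style).
MODEL_TENSORS = {
    MODEL_ARCH.OLMO: [
        "token_embd",
        "output",
        "attn_q",
        "attn_k",
        "attn_v",
        "ffn_up",
        "ffn_down",
    ],
}

print(MODEL_ARCH_NAMES[MODEL_ARCH.OLMO])  # -> olmo
```

The point is just that adding the arch string alone isn't enough; the converter also needs to know which tensor names that arch is allowed to contain.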

As a result, I am running into deeper model issues with llama.cpp. For example:

Loading model: OLMo-7B
gguf: This GGUF file is for Little Endian only
Set model parameters
Set model tokenizer
The repository for /backup_disks/OLMo-7B contains custom code which must be executed to correctly load the model. You can inspect the repository content at https://hf.co//backup_disks/OLMo-7B.
You can avoid this prompt in future by passing the argument `trust_remote_code=True`.

Do you wish to run the custom code? [y/N] y
gguf: Adding 50009 merge(s).
gguf: Setting special token type eos to 50279
gguf: Setting special token type pad to 1
Exporting model to 'olmo.gguf'
gguf: loading model part 'pytorch_model.bin'
Can not map tensor 'model.transformer.wte.weight'

Transformers needs to be updated to the latest version from GitHub, but ai2-olmo seems to require a version of torch that is hard to resolve. I will give it one last attempt with torch-2.3.0a0+git52b679d. But I fear a proper arch needs to be added to llama.cpp, and all my attempts have been to no avail. In that regard, I think I am simply trying to use llama.cpp's convert_hf_to_gguf.py too early at this point.
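The "Can not map tensor" failure above is the converter hitting a checkpoint key it has no GGUF name for. A hedged sketch of the kind of remapping involved; only the `model.transformer.wte.weight` key comes from the log above, while the other source keys and all the target GGUF names are assumptions in llama.cpp's naming style:

```python
import re

# Hypothetical mapping from OLMo HF checkpoint keys to GGUF tensor names.
# Only the embedding key is taken from the error in the log above; the
# rest are illustrative guesses, not verified OLMo checkpoint keys.
TENSOR_MAP = {
    r"model\.transformer\.wte\.weight": "token_embd.weight",
    r"model\.transformer\.ff_out\.weight": "output.weight",
    r"model\.transformer\.blocks\.(\d+)\.att_proj\.weight": r"blk.\1.attn_qkv.weight",
}

def map_tensor_name(hf_name: str) -> str:
    """Translate one checkpoint key to its GGUF name, or fail loudly."""
    for pattern, gguf_name in TENSOR_MAP.items():
        new_name, n = re.subn(rf"^{pattern}$", gguf_name, hf_name)
        if n:
            return new_name
    # This is the situation the convert script reports as "Can not map tensor".
    raise ValueError(f"Can not map tensor {hf_name!r}")

print(map_tensor_name("model.transformer.wte.weight"))  # -> token_embd.weight
```

Until a table like this exists for the OLMo arch, the converter has no choice but to bail out on the first unrecognized key.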

Interested to see if anyone can make GGUF work for Olmo arch.

Any news about the GGUF versions? Could someone finally make them?

I tried, but I couldn't add this architecture to llama.cpp and make the required changes.

I hope they add this feature to llama.cpp soon.

Awesome! Thanks @eleius. Shall we do it, or has it been done already?

It seems @nopperl just made the GGUFs; I'll have to try them.
