How to convert to GGUF

#11
by Jasper17

Can I ask how to do this conversion correctly? Why do I get the error shown below?

(llm_venv_llamacpp) xlab@xlab:/mnt/Model/MistralAI/llm_llamacpp$ python convert_hf_to_gguf.py /mnt/Model/MistralAI/Mistral-Large-Instruct-2407 --outfile ../llm_quantized/mistral_large2_instruct_f16.gguf --outtype f16 --no-lazy
INFO:hf-to-gguf:Loading model: Mistral-Large-Instruct-2407
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:Set meta model
INFO:hf-to-gguf:Set model parameters
INFO:hf-to-gguf:gguf: context length = 131072
INFO:hf-to-gguf:gguf: embedding length = 12288
INFO:hf-to-gguf:gguf: feed forward length = 28672
INFO:hf-to-gguf:gguf: head count = 96
INFO:hf-to-gguf:gguf: key-value head count = 8
INFO:hf-to-gguf:gguf: rope theta = 1000000.0
INFO:hf-to-gguf:gguf: rms norm epsilon = 1e-05
INFO:hf-to-gguf:gguf: file type = 1
INFO:hf-to-gguf:Set model tokenizer
INFO:gguf.vocab:Setting special token type bos to 1
INFO:gguf.vocab:Setting special token type eos to 2
INFO:gguf.vocab:Setting special token type unk to 0
INFO:gguf.vocab:Setting add_bos_token to True
INFO:gguf.vocab:Setting add_eos_token to False
INFO:gguf.vocab:Setting chat_template to {%- if messages[0]['role'] == 'system' %}
    {%- set system_message = messages[0]['content'] %}
    {%- set loop_messages = messages[1:] %}
{%- else %}
    {%- set loop_messages = messages %}
{%- endif %}

{{- bos_token }}
{%- for message in loop_messages %}
    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}
        {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}
    {%- endif %}
    {%- if message['role'] == 'user' %}
        {%- if loop.last and system_message is defined %}
            {{- '[INST] ' + system_message + '\n\n' + message['content'] + '[/INST]' }}
        {%- else %}
            {{- '[INST] ' + message['content'] + '[/INST]' }}
        {%- endif %}
    {%- elif message['role'] == 'assistant' %}
        {{- ' ' + message['content'] + eos_token}}
    {%- else %}
        {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}
    {%- endif %}
{%- endfor %}

INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:../llm_quantized/mistral_large2_instruct_f16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
INFO:hf-to-gguf:Model successfully exported to ../llm_quantized/mistral_large2_instruct_f16.gguf

Hi,

Where is the error? "Model successfully exported to ../llm_quantized/mistral_large2_instruct_f16.gguf" means the conversion finished with no error.

The error shows in the last two lines: you can see that no data was written, and the generated GGUF file is only a few hundred kilobytes. May I ask whether you changed any llama.cpp parameters during quantization?

INFO:gguf.gguf_writer:../llm_quantized/mistral_large2_instruct_f16.gguf: n_tensors = 0, total_size = negligible - metadata only
Writing: 0.00byte [00:00, ?byte/s]
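
You can verify this yourself with the gguf-dump tool that ships with llama.cpp's gguf Python package (a quick sanity check, assuming the gguf package is installed in your venv; the output path below is the one from your log):

pip install gguf  # only needed if it is not already installed from llama.cpp/gguf-py
gguf-dump ../llm_quantized/mistral_large2_instruct_f16.gguf

If the dump lists only key-value metadata and no tensors, no weights were written. A healthy f16 export of a model this size should report hundreds of tensors and be on the order of hundreds of gigabytes, not kilobytes.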

I would pull the latest llama.cpp changes from git and make sure you run make clean and make so that you have all the latest changes.
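
For reference, the update steps would look something like this (a sketch assuming a Makefile-based llama.cpp checkout; the path is a placeholder, and newer llama.cpp revisions build with CMake instead of make):

cd /path/to/llama.cpp  # your local clone; adjust the path
git pull
make clean
make
pip install -r requirements.txt  # refresh the Python deps used by convert_hf_to_gguf.py

The last step matters because convert_hf_to_gguf.py is pure Python; rebuilding the C++ binaries alone will not update the conversion script's gguf dependency.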
