Mac Studio: error loading model vocabulary: unknown pre-tokenizer type: 'grok-2'

#5
by cloudyu - opened

llama_model_loader: - type f32: 321 tensors
llama_model_loader: - type q8_0: 128 tensors
llama_model_loader: - type q4_K: 385 tensors
llama_model_loader: - type q5_K: 64 tensors
llama_model_loader: - type q6_K: 65 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 152.78 GiB (4.87 BPW)
llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'grok-2'

./llama-cli --version
version: 6455 (d032a1b09)
built with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.5.0

On the Mac Studio, I pulled the latest llama.cpp and built it successfully:

git fetch origin pull/15539/head:MASTER && git checkout MASTER && cd ..
From https://github.com/ggerganov/llama.cpp

 * [new ref]         refs/pull/15539/head -> MASTER
    M .github/workflows/build.yml
    M .github/workflows/release.yml
    M CONTRIBUTING.md
    M ci/run.sh
    M common/arg.cpp
    M common/common.h
    M convert_hf_to_gguf.py
    M convert_hf_to_gguf_update.py
    M docs/backend/CANN.md
    M docs/build-s390x.md
    M docs/ops.md
    M docs/ops/zDNN.csv
    M examples/eval-callback/eval-callback.cpp
    M examples/model-conversion/requirements.txt
    M examples/model-conversion/scripts/causal/run-org-model.py
    M ggml/include/ggml-backend.h
    M ggml/include/ggml-metal.h
    M ggml/include/ggml-zdnn.h
    M ggml/src/ggml-backend-impl.h
    M ggml/src/ggml-backend-reg.cpp
    M ggml/src/ggml-cann/aclnn_ops.cpp
    M ggml/src/ggml-cann/common.h
    M ggml/src/ggml-cann/ggml-cann.cpp
    M ggml/src/ggml-cpu/CMakeLists.txt
    M ggml/src/ggml-cpu/kleidiai/kleidiai.cpp
    M ggml/src/ggml-cpu/ops.cpp
    M ggml/src/ggml-cuda/CMakeLists.txt
    M ggml/src/ggml-cuda/binbcast.cu
    M ggml/src/ggml-cuda/common.cuh
    M ggml/src/ggml-cuda/fattn-tile.cu
    M ggml/src/ggml-cuda/getrows.cu
    M ggml/src/ggml-cuda/ggml-cuda.cu
    M ggml/src/ggml-cuda/mma.cuh
    M ggml/src/ggml-cuda/mmf.cu
    M ggml/src/ggml-cuda/mmf.cuh
    M ggml/src/ggml-cuda/template-instances/generate_cu_files.py
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_1.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_10.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_11.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_12.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_13.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_14.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_15.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_16.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_2.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_3.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_4.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_5.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_6.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_7.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_8.cu
    A ggml/src/ggml-cuda/template-instances/mmf-instance-ncols_9.cu
    M ggml/src/ggml-cuda/vendors/hip.h
    M ggml/src/ggml-metal/CMakeLists.txt
    A ggml/src/ggml-metal/ggml-metal-common.cpp
    A ggml/src/ggml-metal/ggml-metal-common.h
    M ggml/src/ggml-metal/ggml-metal.m
    M ggml/src/ggml-metal/ggml-metal.metal
    M ggml/src/ggml-sycl/binbcast.cpp
    M ggml/src/ggml-sycl/concat.cpp
    M ggml/src/ggml-sycl/conv.cpp
    M ggml/src/ggml-sycl/convert.cpp
    M ggml/src/ggml-sycl/cpy.cpp
    M ggml/src/ggml-sycl/dmmv.cpp
    M ggml/src/ggml-sycl/dpct/helper.hpp
    M ggml/src/ggml-sycl/element_wise.cpp
    M ggml/src/ggml-sycl/getrows.cpp
    M ggml/src/ggml-sycl/ggml-sycl.cpp
    M ggml/src/ggml-sycl/gla.cpp
    M ggml/src/ggml-sycl/im2col.cpp
    M ggml/src/ggml-sycl/mmq.cpp
    M ggml/src/ggml-sycl/mmvq.cpp
    M ggml/src/ggml-sycl/norm.cpp
    M ggml/src/ggml-sycl/rope.cpp
    M ggml/src/ggml-sycl/set_rows.cpp
    M ggml/src/ggml-sycl/softmax.cpp
    M ggml/src/ggml-sycl/tsembd.cpp
    M ggml/src/ggml-sycl/wkv.cpp
    M ggml/src/ggml-vulkan/ggml-vulkan.cpp
    M ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq2_s.comp
    M ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq2_xxs.comp
    M ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq3_s.comp
    M ggml/src/ggml-vulkan/vulkan-shaders/dequant_iq3_xxs.comp
    M ggml/src/ggml-vulkan/vulkan-shaders/soft_max_back.comp
    M ggml/src/ggml-zdnn/ggml-zdnn-impl.h
    M ggml/src/ggml-zdnn/ggml-zdnn.cpp
    M gguf-py/gguf/constants.py
    M gguf-py/gguf/gguf_writer.py
    M gguf-py/gguf/tensor_mapping.py
    A media/llama1-icon-transparent.png
    A media/llama1-icon-transparent.svg
    M requirements/requirements-convert_hf_to_gguf.txt
    M requirements/requirements-convert_legacy_llama.txt
    M requirements/requirements-tool_bench.txt
    M src/llama-arch.cpp
    M src/llama-arch.h
    M src/llama-chat.cpp
    M src/llama-chat.h
    M src/llama-context.cpp
    M src/llama-graph.cpp
    M src/llama-hparams.h
    M src/llama-model.cpp
    M src/llama-quant.cpp
    M src/llama-vocab.cpp
    M src/llama-vocab.h
    M src/llama.cpp
    M tests/.gitignore
    M tests/test-backend-ops.cpp
    M tests/test-tokenizer-random.py
    M tools/llama-bench/llama-bench.cpp
    M tools/main/README.md
    M tools/mtmd/clip.cpp
    M tools/mtmd/legacy-models/minicpmv-convert-image-encoder-to-gguf.py
    M tools/mtmd/requirements.txt
    M tools/rpc/rpc-server.cpp
    M tools/server/server.cpp
    M tools/server/tests/requirements.txt
    Switched to branch 'MASTER'
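
Since the error persists after switching branches, one quick sanity check is whether the checked-out tree mentions the 'grok-2' pre-tokenizer at all. The sketch below is only a rough text search, not part of llama.cpp: the helper name `knows_pre_tokenizer` is hypothetical, and it assumes pre-tokenizer names appear as double-quoted string literals in src/llama-vocab.cpp (one of the files listed above).

```python
from pathlib import Path


def knows_pre_tokenizer(source_file: str, name: str) -> bool:
    """Return True if `source_file` contains `name` as a double-quoted literal.

    Hypothetical helper: a plain text search, assuming pre-tokenizer names
    are written as string literals (e.g. "grok-2") in the source file.
    """
    text = Path(source_file).read_text(encoding="utf-8", errors="replace")
    return f'"{name}"' in text


# Example call, run from the root of the llama.cpp checkout (path assumed):
# knows_pre_tokenizer("src/llama-vocab.cpp", "grok-2")
```

If this returns False, the checked-out sources probably do not include the grok-2 changes. If it returns True, the llama-cli binary may simply have been built before the checkout and need a rebuild; the reported version 6455 (d032a1b09) would then not correspond to this branch.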
