BitsAndBytesConfig error

#1
by vdavidr - opened

Hi, I'm trying to load the model with:

from unsloth import FastLanguageModel

model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,  # Choose ANY! e.g. teknium/OpenHermes-2.5-Mistral-7B
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
    quantization_config = nf4_config,
    # token = "hf_...",  # use one if using gated models like meta-llama/Llama-2-7b-hf
)

I'm getting this error: Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used in <class 'transformers.utils.quantization_config.BitsAndBytesConfig'>.

I'm using transformers-4.40.1.
I don't get the error with transformers-4.39.3.

I see that the config.json uses these kwargs.
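
For reference, newer transformers versions serialize BitsAndBytesConfig with underscore-prefixed internal fields ('_load_in_4bit', '_load_in_8bit') plus 'quant_method', which is presumably what the repo's config.json contains. Here is a minimal sketch of building an equivalent NF4 config with the current public kwargs; the variable name nf4_config matches the snippet above, and the specific option values are illustrative assumptions, not taken from the repo:

import torch
from transformers import BitsAndBytesConfig

# NF4 quantization config using the current public kwargs; the
# '_load_in_4bit' / '_load_in_8bit' / 'quant_method' entries in the
# repo's config.json are just the serialized form of this object.
nf4_config = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_use_double_quant = True,       # illustrative choice
    bnb_4bit_compute_dtype = torch.bfloat16, # illustrative choice
)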

Unsloth AI org

Oh, no need to worry about the error - just ignore it; it should still work. I might update all of the models to use the new kwargs.

Hi there,
I have downloaded this model to use with Ollama. I was wondering if you could leave a note on how to do this, and if it's not possible with Ollama, how I should run it. I'm new at this, as you can tell :).
Thank you!

Unsloth AI org


We have a wiki that covers using Unsloth with Ollama: https://github.com/unslothai/unsloth/wiki#use-unsloth-lora-adapter-with-ollama
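
As a rough sketch (not official guidance): since this repo ships 4-bit safetensors rather than a GGUF file, one route into Ollama is to load the model with Unsloth and export a GGUF, then point an Ollama Modelfile at the result. The quantization_method value and max_seq_length below are assumptions; check the wiki for the supported options.

from unsloth import FastLanguageModel

# Load the 4-bit model, then export it as GGUF for use with Ollama.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-Instruct-bnb-4bit",
    max_seq_length = 2048,  # illustrative choice
    load_in_4bit = True,
)

# save_pretrained_gguf writes a .gguf file into the "model" directory;
# an Ollama Modelfile can then reference it with a FROM line pointing
# at that file.
model.save_pretrained_gguf("model", tokenizer, quantization_method = "q4_k_m")

After that, something like ollama create my-llama3 -f Modelfile followed by ollama run my-llama3 should work, assuming Ollama is installed and the Modelfile's FROM line points at the exported .gguf file.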
