Number of tokens exceeded maximum context length (512)

#7 opened by shivammehta

I am getting the warning "Number of tokens exceeded maximum context length (512)", as shown in the screenshot below.
How do I solve this issue?
Code:

from langchain.llms import CTransformers

def load_llm():
    # Load the locally downloaded model here
    llm = CTransformers(
        model="TheBloke/Mistral-7B-Instruct-v0.1-GGUF",
        model_type="llama",
        max_new_tokens=512,
        temperature=0.5
    )
    return llm
[Screenshot: MicrosoftTeams-image (34).png, showing the context-length warning]

You need to set the correct context length via the config parameters:

config = {'max_new_tokens': 400, 'temperature': 0, 'context_length': 4096}
llm = CTransformers(
    model='TheBloke/Mistral-7B-Instruct-v0.1-GGUF',
    model_file="mistral-7b-instruct-v0.1.Q8_0.gguf",
    config=config
)
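
The warning is raised because the context length defaults to 512 tokens, so raising context_length in the config is the fix. For reference, a minimal end-to-end sketch of the corrected setup (assuming a LangChain version where the LLM object is directly callable; newer versions use llm.invoke(...), and the prompt string is just an illustration):

from langchain.llms import CTransformers

config = {'max_new_tokens': 400, 'temperature': 0, 'context_length': 4096}
llm = CTransformers(
    model='TheBloke/Mistral-7B-Instruct-v0.1-GGUF',
    model_file='mistral-7b-instruct-v0.1.Q8_0.gguf',
    config=config
)

# Prompts longer than 512 tokens should no longer trigger the warning
print(llm("What does the context_length setting control?"))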

Based on the above, I tried the following and I still get an error:
from ctransformers import AutoModelForCausalLM

model_chat_ckpt = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
model_chat_file = 'mistral-7b-instruct-v0.1.Q4_K_M.gguf'
model_chat_type = 'mistral'

config = {'context_length': 4096}
model = AutoModelForCausalLM.from_pretrained(
    model_path_or_repo_id=model_chat_ckpt,
    model_type=model_chat_type,
    model_file=model_chat_file,
    local_files_only=True,
    config=config.config,  # this line raises the AttributeError below
)

AttributeError: 'dict' object has no attribute 'config'
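
The AttributeError comes from config.config: config is a plain dict, so it has no .config attribute. A minimal sketch of the corrected call, assuming a ctransformers version whose from_pretrained accepts config values as keyword arguments (unpacking the dict with ** passes each entry as one):

from ctransformers import AutoModelForCausalLM

model_chat_ckpt = "TheBloke/Mistral-7B-Instruct-v0.1-GGUF"
model_chat_file = "mistral-7b-instruct-v0.1.Q4_K_M.gguf"
model_chat_type = "mistral"

config = {'context_length': 4096}
model = AutoModelForCausalLM.from_pretrained(
    model_path_or_repo_id=model_chat_ckpt,
    model_type=model_chat_type,
    model_file=model_chat_file,
    local_files_only=True,
    **config,  # equivalent to passing context_length=4096 directly
)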
