Model doesn't work

#5
by rjmehta - opened

The model doesn't print anything, just blank spaces. I'm using exllamav2.
Please help, @TheBloke.

INPUT:
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.85
settings.top_k = 50
settings.top_p = 0.8
settings.token_repetition_penalty = 1
#settings.disallow_tokens(tokenizer, [tokenizer.eos_token_id])
max_new_tokens = 10

# Prompt

prompt = f"""Write a working python code.
/#/#/# Instruction:
Write a working python code to generate 100 random numbers.
/#/#/# Response:

"""
input_ids = tokenizer.encode(prompt)
prompt_tokens = input_ids.shape[-1]
# Make sure CUDA is initialized so we can measure performance
generator.warmup()
# Send prompt to generator to begin stream
time_begin_prompt = time.time()
print (prompt, end = "")
sys.stdout.flush()
generator.set_stop_conditions([])
generator.begin_stream(input_ids, settings)
time_begin_stream = time.time()
generated_tokens = 0
while True:
    chunk, eos, _ = generator.stream()
    generated_tokens += 1
    print (chunk, end = "")
    sys.stdout.flush()
    if eos or generated_tokens == max_new_tokens: break
time_end = time.time()
time_prompt = time_begin_stream - time_begin_prompt
time_tokens = time_end - time_begin_stream
print()
print()
print(f"Prompt processed in {time_prompt:.2f} seconds, {prompt_tokens} tokens, {prompt_tokens / time_prompt:.2f} tokens/second")
print(f"Response generated in {time_tokens:.2f} seconds, {generated_tokens} tokens, {generated_tokens / time_tokens:.2f} tokens/second")

OUTPUT:

Write a working python code.
/#/#/# Instruction:
Write a working python code to generate 100 random numbers.
/#/#/# Response:

Prompt processed in 0.00 seconds, 32 tokens, 27396.96 tokens/second
Response generated in 0.43 seconds, 10 tokens, 23.49 tokens/second

Okay, I had to manually set rope_scale to 4.0. But the GPTQ model doesn't print the EOS token.

Hi, I'm hitting a similar issue with vLLM. Do you mean the root cause is rope_scale? Where can I modify it? Thank you.
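
On the "where can I modify this?" question, here is a minimal sketch. It assumes the model follows the Hugging Face convention of declaring linear RoPE scaling under a `rope_scaling` entry in `config.json` (for example `{"type": "linear", "factor": 4.0}`). The helper names `load_config`, `rope_factor` and `scaled_positions` are made up for illustration and are not part of exllamav2 or vLLM. It only shows how to check which factor the checkpoint declares and what linear scaling does to position indices. How each loader picks the value up, or lets you override it, depends on the loader and its version.

```python
import json
from pathlib import Path


def load_config(model_dir: str) -> dict:
    """Read config.json from a local model directory (path is an assumption)."""
    return json.loads((Path(model_dir) / "config.json").read_text(encoding="utf-8"))


def rope_factor(config: dict) -> float:
    """Return the declared RoPE scaling factor, or 1.0 if none is declared."""
    rope_scaling = config.get("rope_scaling") or {}
    return float(rope_scaling.get("factor", 1.0))


def scaled_positions(num_tokens: int, factor: float) -> list:
    """Linear RoPE scaling divides each position index by the factor."""
    return [pos / factor for pos in range(num_tokens)]


# Example with an in-memory config instead of a real checkpoint.
example_config = {"rope_scaling": {"type": "linear", "factor": 4.0}}
factor = rope_factor(example_config)
print(f"declared factor: {factor}")
print(f"first positions: {scaled_positions(5, factor)}")
```

If `rope_factor` returns 1.0 for a model that was fine-tuned with a factor of 4, that would be consistent with having to set the value manually, as described above.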

I have an issue loading this model with Text generation web UI. It gives me the error "UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 125: character maps to <undefined>". Does anyone have an idea how to solve this?
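
A possible cause, not a confirmed diagnosis: this is the message Python raises when UTF-8 text is decoded with the Windows default `cp1252` ("charmap") codec, in which byte 0x81 has no mapping. The sketch below reproduces the error on a two-byte sample and shows that decoding with explicit UTF-8 works. Which file the web UI trips over is not known here, and the sample bytes are an assumption chosen only to trigger the same error.

```python
# Hypothetical sample: the UTF-8 encoding of U+00C1 is the bytes C3 81,
# and 0x81 is undefined in the cp1252 code page.
sample = b"\xc3\x81"

try:
    sample.decode("cp1252")
    cp1252_failed = False
except UnicodeDecodeError as err:
    cp1252_failed = True
    print(f"cp1252 failed: {err}")

# Decoding with the encoding the bytes were written in works.
decoded = sample.decode("utf-8")
print(f"utf-8 decoded {len(decoded)} character(s)")
```

If that is the cause, the usual workarounds are opening the file with `encoding="utf-8"`, or starting Python with the `PYTHONUTF8=1` environment variable set (UTF-8 mode, Python 3.7+).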
