It works.

#3 by Yuuru - opened

Just tested Q4_0. Runs fine.

Please re-download the files: the rope_theta value was wrong and is now fixed. Apparently this affects generation quality.
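For context on why a wrong rope_theta hurts quality: rope_theta is the base of the rotary position embedding (RoPE), so changing it shifts every positional rotation frequency the model was trained with. A minimal sketch of the standard RoPE frequency formula — the head dimension and theta values below are illustrative, not this model's actual config:

```python
def rope_frequencies(head_dim, rope_theta):
    # Standard RoPE assigns each dimension pair i a rotation frequency:
    #   freq_i = rope_theta ** (-2*i / head_dim)
    # A different rope_theta therefore rescales every frequency, so
    # attention sees positions differently than during training.
    return [rope_theta ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

# Hypothetical example: 128-dim heads, common theta 10000.0 vs 1000000.0.
f_small = rope_frequencies(128, 10000.0)
f_large = rope_frequencies(128, 1000000.0)
```

The first frequency is always 1.0, but the high-index (long-range) frequencies differ by orders of magnitude between the two theta values, which is why generation degrades when the metadata is wrong.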


It's so much smarter now. Tried Q3 with full offload; it's comfortably fast.

llama_print_timings: prompt eval time = 1357.32 ms / 63 tokens ( 21.54 ms per token, 46.42 tokens per second)

Tested Q4_K_M. Runs perfectly. Thank you!

What are the minimum system requirements? Did you run it locally?
