Example code on running the model?

#7
by bwzimpel - opened

Hi, do you have example code on running the model? I'd love to have some pointers. Thanks.

This model type has to be run on exllamav2 (https://github.com/turboderp/exllamav2)

The README there includes a small code example showing how to run it.

Another option is to use a UI such as oobabooga/text-generation-webui (https://github.com/oobabooga/text-generation-webui): install it, place the model in the text-generation-webui folder, then select exllamav2_hf in the model loader (it should be detected automatically) and load the model there.

Thanks for the pointer. I was able to get the model running with exllamav2, but it never seems to output a stop token; it just generates until it hits the max token limit. How do I properly instruct it to stop at a stop token?
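For reference, here is a minimal sketch of what stop-token handling usually looks like inside a generation loop. It is not exllamav2's API: `generate_until_stop`, `toy_next_token`, and `EOS_ID` are hypothetical names used only for illustration, and the "model" is a toy function. The point is that the loop has to compare each sampled token against the stop ids; if the stop id passed in does not match what the model actually emits (or the prompt format never leads the model to emit it), generation runs to the max token limit.

```python
from typing import Callable, Iterable, List


def generate_until_stop(
    next_token: Callable[[List[int]], int],
    prompt_ids: List[int],
    stop_ids: Iterable[int],
    max_new_tokens: int,
) -> List[int]:
    """Return newly generated token ids, ending early at a stop token."""
    stop = set(stop_ids)
    context = list(prompt_ids)
    generated: List[int] = []
    for _ in range(max_new_tokens):
        token = next_token(context)
        if token in stop:
            break  # the stop token itself is not added to the output
        generated.append(token)
        context.append(token)
    return generated


# Toy stand-in for a model (assumption): returns the last token id plus one.
def toy_next_token(context: List[int]) -> int:
    return context[-1] + 1


EOS_ID = 5  # hypothetical end-of-sequence token id
result = generate_until_stop(toy_next_token, [1, 2], [EOS_ID], max_new_tokens=10)
print(result)  # [3, 4]
```

With an empty `stop_ids` list the same call runs for the full `max_new_tokens`, which matches the behavior described above.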
