Wrong Output in Usage Example Code?

#5
by hmomin - opened

When I run the example code in the Usage section, I don't get the output 47.4404296875 shown at the bottom.

Instead, I get -221.7861328125.

Thanks

OpenBMB org

Hi,

We fixed a bug in the demo yesterday, changing [\INST] to [/INST], and have not yet had time to re-test the reward with the corrected code. If you are using the latest code, you may get a different output from the one shown in the example.
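
For anyone comparing outputs, here is a minimal sketch of how to check whether a prompt string still contains the old closing tag. The helper names build_prompt and has_malformed_inst_tag are hypothetical and not part of the model's code. The template assumes the Mistral-style [INST] ... [/INST] layout, and the exact spacing in the actual demo may differ.

```python
def build_prompt(instruction: str, response: str) -> str:
    """Assemble a prompt in an assumed Mistral-style instruction format.

    Hypothetical helper for illustration; the real demo may use
    different spacing or extra tokens.
    """
    return f"[INST] {instruction} [/INST] {response}"


def has_malformed_inst_tag(prompt: str) -> bool:
    """Return True if the prompt contains the backslash variant [\\INST]."""
    return "[\\INST]" in prompt


good_prompt = build_prompt("Say hello.", "Hello!")
# Recreate the old, buggy form by swapping the closing tag.
bad_prompt = good_prompt.replace("[/INST]", "[\\INST]")

print(has_malformed_inst_tag(good_prompt))
print(has_malformed_inst_tag(bad_prompt))
```

Since the closing tag is part of the text the model scores, a prompt built with the old tag and one built with the corrected tag are different inputs, so their rewards are not expected to match.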