Problem using the Mistral AI model API from Hugging Face

#26
by Hawks101 - opened

Hi, I am using the Mistral-7B-Instruct-v0.1 model through Hugging Face's API for question answering over a PDF. It works, but the response is not long; it gets cut off halfway through, after one sentence. Please help.

I have the exact same problem; I can only get around 20-30 tokens in the response. I wonder if it's an internal limitation of the model.

Please let me know if you find a solution.


Can you attach an image of the code you are using for generation? It's working fine for me.

Can you also try `model.generate(**inputs, max_new_tokens=350)`? The default is 20 new tokens, which might explain what's happening.
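For local use with `transformers`, the suggestion above looks roughly like the sketch below. The model name is the one from this thread; the prompt handling and sampling settings are assumptions, and the heavy imports are kept inside the function so the small helper stays importable on its own:

```python
def generation_params(max_new_tokens: int = 350) -> dict:
    # transformers generates at most 20 new tokens by default,
    # which matches the ~20-30 token cutoff reported above.
    return {"max_new_tokens": max_new_tokens}

def generate_answer(prompt: str, max_new_tokens: int = 350) -> str:
    # Local imports: requires transformers + torch installed and
    # enough memory to load a 7B model (device placement omitted).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.1"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_params(max_new_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```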

You can make a POST request with `inputs` (your query) and add a `"parameters"` field. Increase the `max_new_tokens` amount to get more text:

```json
{
    "inputs": inputs,
    "parameters": {
        "max_new_tokens": 100,
        "temperature": 0.5,
        "top_k": 40,
        "top_p": 0.95,
        "repetition_penalty": 1.1
    }
}
```
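In Python, building and sending that payload to the Inference API could look like the sketch below. The endpoint URL follows the usual `api-inference.huggingface.co/models/<model-id>` pattern, and the token handling is an assumption; only `build_payload` mirrors the JSON shown above exactly:

```python
import json
import urllib.request

# Standard Inference API endpoint pattern for this model (assumption).
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mistral-7B-Instruct-v0.1"

def build_payload(inputs: str, max_new_tokens: int = 100) -> dict:
    # Same structure as the JSON above; "parameters" controls generation.
    return {
        "inputs": inputs,
        "parameters": {
            "max_new_tokens": max_new_tokens,
            "temperature": 0.5,
            "top_k": 40,
            "top_p": 0.95,
            "repetition_penalty": 1.1,
        },
    }

def query(inputs: str, api_token: str, max_new_tokens: int = 100) -> dict:
    # Hypothetical POST call; needs a valid Hugging Face API token.
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(inputs, max_new_tokens)).encode(),
        headers={"Authorization": f"Bearer {api_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Raising `max_new_tokens` (e.g. to 350) is the part that actually lengthens the response; the sampling parameters only shape its style.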
