use of <bot_end>

#8
by nzaveri - opened

Hi,
Excellent work on function calling. However, how can I use the `<bot_end>` stop token to save on inference speed and tokens?

```python
result = pipeline(prompt, max_new_tokens=2048, stop="<bot_end>", return_full_text=False, do_sample=False, temperature=0.001)[0]["generated_text"]
print(result)
```

Pipeline is:

```python
pipeline = pipeline(
    "text-generation",
    model="Nexusflow/NexusRaven-V2-13B",
    torch_dtype="auto",
    device_map="auto",
)
```

Error:

```
ValueError: The following model_kwargs are not used by the model: ['stop'] (note: typos in the generate arguments will also show up in this list)
```

Hi @nzaveri !

Thank you for your interest in the model! There are a couple of ways you can implement this. The easiest is to use TGI (Text Generation Inference), since it accepts stop sequences as one of the arguments in the request payload: you can spin up a TGI endpoint and send REST-like requests with a stop sequence in the `parameters` dict of your payload. For the text-generation pipeline, I don't believe there's an easy built-in option for stop strings. You'll likely have to implement a `StoppingCriteriaList` that wraps a custom `StoppingCriteria` (where you check for `"<bot_end>"` in its tokenized form). Something like this: https://huggingface.co/stabilityai/stablelm-tuned-alpha-3b/commit/072102d1d3462d9b2e18d91f4d22e894d83e7ccf
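A minimal sketch of that approach, assuming the `transformers` `StoppingCriteria` interface; the class name `StopOnTokens` is illustrative, not part of the library:

```python
import torch
from transformers import StoppingCriteria, StoppingCriteriaList


class StopOnTokens(StoppingCriteria):
    """Stop generation once the output ends with a given token-id sequence."""

    def __init__(self, stop_ids):
        self.stop_ids = list(stop_ids)

    def __call__(self, input_ids, scores, **kwargs):
        # input_ids has shape (batch, seq_len); compare the tail of the
        # first sequence against the stop-token ids.
        if input_ids.shape[1] < len(self.stop_ids):
            return False
        return input_ids[0, -len(self.stop_ids):].tolist() == self.stop_ids
```

You would then tokenize the stop string with something like `stop_ids = pipeline.tokenizer.encode("<bot_end>", add_special_tokens=False)` and pass `stopping_criteria=StoppingCriteriaList([StopOnTokens(stop_ids)])` into the pipeline call in place of the unsupported `stop` kwarg (the pipeline forwards it to `generate`).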
