Function calling

#6
by denk64 - opened

Hello,

I've been quite satisfied with this model and how easy it is to program with; I just dropped it into my existing Mistral API setup and it worked right away.

It even performs function calls well with no change to the prompt, but once I get to 5+ functions it starts returning the few-shot examples as answers and hallucinating.

Is there a proper way to get it to do function calls? So far I've been using chat prompts with examples to make it answer in JSON.

Thank you for it.

I used TheBloke's AWQ quant with vLLM on my home server (2x 3060 12GB) and get 25-27 tokens/s.
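
For anyone wanting to reproduce that setup, here is a rough sketch of serving an AWQ quant with vLLM's Python API across two GPUs. The repo name is a placeholder and the memory/context settings are guesses, so adjust them for your model and cards:

```python
from vllm import LLM, SamplingParams

# Placeholder repo id: substitute the actual AWQ quant you are running.
llm = LLM(
    model="TheBloke/your-model-AWQ",
    quantization="awq",
    tensor_parallel_size=2,        # split the weights across the two 3060s
    gpu_memory_utilization=0.90,   # leave a little headroom per card
    max_model_len=4096,            # shrink if you run out of KV-cache memory
)

params = SamplingParams(temperature=0.7, max_tokens=256)
outputs = llm.generate(["Hello, who are you?"], params)
print(outputs[0].outputs[0].text)
```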

Thank you for your feedback, I'm glad to hear it's been working well.

As for the function calling, this model has not been explicitly fine-tuned to call functions. As a result, it's likely to get lost in complex interactions.

I've been looking into fine-tuning explicitly for function calling, and perhaps a future version will be more capable in that regard.

Thanks again for the interest!

@macadeliccc I can only say go for it; there are some function-calling datasets out there and I'm sure it would work very well with this setup.

As I said, it already worked very well with one-parameter and zero-parameter functions from just one example I gave it in a multi-turn chat.
I'm going to keep using this model and tinker with it to see how well it can perform; I'm also going to train a DistilBERT classifier to assign different sets of functions depending on the prompt (rough sketch below).
I think having an LLM dedicated solely to function calls is a waste of resources, since a general-purpose model can be steered toward function calls with prompts and assisted by a separate tiny classifier model that can be trained for this purpose very easily and on the fly, so maybe that's something to think about.
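
A rough sketch of that kind of routing with a fine-tuned DistilBERT and the transformers pipeline (the labels, function groups, and model path are made up for illustration):

```python
from transformers import pipeline

# Hypothetical mapping from classifier labels to the function subsets
# that get included in the LLM prompt.
FUNCTION_SETS = {
    "search": ["google_search"],
    "media": ["play_youtube_video"],
    "chat": [],  # plain conversation, no functions exposed
}

# Placeholder path: e.g. distilbert-base-uncased fine-tuned on short prompts
# labeled with the function group they should map to.
classifier = pipeline("text-classification", model="path/to/finetuned-distilbert")

def select_functions(user_prompt: str) -> list[str]:
    """Pick which function templates to put into the LLM prompt."""
    label = classifier(user_prompt)[0]["label"]
    return FUNCTION_SETS.get(label, [])

print(select_functions("can you google the actor who played picard"))
```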

My little project using that kind of setup was a Discord chat bot that could go to wowhead.com and scrape quest, item and NPC information on the fly (RAG), so a Mistral model could then answer specific questions about those things. The prompt selector was a DistilBERT model trained on GPT-4 synthetic data to recognize what the user wants (chat, code, or a prompt about a quest, item or NPC).

The bot was on a small Discord server with around 100 users, of which around 10-15 actually used it. DistilBERT handled the prompt selection in under 20 ms, and Mistral output was around 45 tokens/s on my home hardware: an old Xeon with 64 GB RAM and 2x 3060 12GB + 1080 8GB GPUs.

This is the output of your model for a function that does a Google search. It isn't a conventional function-calling format, but it worked for me. You can see some things are missing, like parameter types and counts, but I wanted to keep it simple.

function template:
        'google_search': {
            'parameters': ['query'],
            'description': 'Searches google for the query and returns the results',
            'usage_example': """if user says: Can you google how to make a pie? You output a json with the function name and parameters like this:
json
'''
{
"function": "google_search",
"parameters": ["How to make a delicious pie?"]
}
'''"""
        }
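
For reference, a rough sketch of how a template like this can be folded into the system prompt by simple string assembly; the wording and the build_system_prompt helper are purely illustrative:

```python
# Illustrative only: build a system prompt from a dict of templates
# shaped like the 'google_search' entry above.
def build_system_prompt(functions: dict) -> str:
    lines = ["You can call these functions. Answer with JSON only when a function applies:"]
    for name, spec in functions.items():
        lines.append(f"- {name}({', '.join(spec['parameters'])}): {spec['description']}")
        lines.append(spec["usage_example"])
    return "\n".join(lines)
```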

prompt: can you google the actor who played picard in star trek

your llm output:
{
"function": "google_search",
"parameters": ["Patrick Stewart Jean-Luc Picard"]
}

As you can see, it worked really well.

Edit: The only issue I had was that it didn't want to admit that it doesn't know something or that a specific function does not exist. I tried the "kittens will die" prompt trick to make it feel bad, but that didn't work ^^ .. but that's something a classifier can maybe solve.
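
For what it's worth, a rough sketch of consuming that JSON on the application side, with a simple membership check to catch hallucinated function names; the registry and handlers are made up for illustration:

```python
import json

# Hypothetical registry mapping the function names used in the prompt
# templates to real Python callables.
FUNCTION_REGISTRY = {
    "google_search": lambda query: f"search results for: {query}",
    "play_youtube_video": lambda url: f"now playing: {url}",
}

def dispatch(llm_output: str):
    """Parse the model's JSON reply and call the matching function."""
    try:
        call = json.loads(llm_output)
    except json.JSONDecodeError:
        return None  # not a function call, treat it as normal chat
    name = call.get("function")
    if name not in FUNCTION_REGISTRY:
        # The model hallucinated a function that doesn't exist:
        # refuse instead of trusting the output.
        return f"Unknown function: {name}"
    return FUNCTION_REGISTRY[name](*call.get("parameters", []))

print(dispatch('{"function": "google_search", "parameters": ["Patrick Stewart Jean-Luc Picard"]}'))
```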

That's very cool. I will certainly be experimenting with this more in the future.

If you're interested, you could make a pull request with your function-calling usage to let others know that it works well in this scenario.

@macadeliccc There is an easter egg I noticed in your model: sometimes when it hallucinates, it spits out a JSON output like this: {'function': 'play_youtube_video', 'parameters': ['https://www.youtube.com/watch?v=dQw4w9WgXcQ']}

If you open the link on YouTube, you're gonna laugh.

The first time I saw it, I thought maybe I had used it as an example, but it occurred multiple times when it hallucinated.

What's funny about it is that my app actually has that function, which takes a YouTube link as a parameter, so I get to hear it playing in the background hahaha..

So the model unintentionally rickrolls users during function calling? That's a genuinely funny easter egg.
