---
license: apache-2.0
tags:
- function-calling
---

# Fireworks Function Calling (FireFunction) Model V1

<img src="https://cdn-uploads.huggingface.co/production/uploads/64b6f3a72f5a966b9722de88/12mfdeAJzW1NdKrN_J--L.png" alt="firefunction" width="400"/>

FireFunction is a state-of-the-art function calling model with a commercially viable license.

💡 The model is also hosted on the [Fireworks](https://fireworks.ai/models/fireworks/firefunction-v1) platform, where it's free during a limited beta period and runs about 4x faster than GPT-4, generating ~110 tokens/sec! The API is compatible with [OpenAI function calling](https://platform.openai.com/docs/guides/function-calling):
```sh
OPENAI_API_BASE=https://api.fireworks.ai/inference/v1
OPENAI_API_KEY=<YOUR_FIREWORKS_API_KEY>
MODEL=accounts/fireworks/models/firefunction-v1
```
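
With those settings, a standard `openai` (v1+) Python client can talk to FireFunction directly. The sketch below is illustrative rather than definitive: the tool schema and prompt are made up, and you need your own Fireworks API key.

```python
from openai import OpenAI

# Point the OpenAI client at the Fireworks endpoint from the settings above
client = OpenAI(
    base_url="https://api.fireworks.ai/inference/v1",
    api_key="<YOUR_FIREWORKS_API_KEY>",
)

# Illustrative tool schema (same shape as OpenAI function calling)
tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",
        "description": "Get the current stock price",
        "parameters": {
            "type": "object",
            "properties": {"symbol": {"type": "string"}},
            "required": ["symbol"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the current stock price of AAPL?"}]

response = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v1",
    messages=messages,
    tools=tools,
)
print(response.choices[0].message.tool_calls)
```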

## Resources
* [FireFunction-v1 Blog Post](https://fireworks.ai/blog)
* [Fireworks discord with function calling channel](https://discord.gg/mMqQxvFD9A)
* [Documentation](https://readme.fireworks.ai/docs/function-calling)
* [UI Demo app](https://functional-chat.vercel.app/)
* [Try in Fireworks prompt playground UI](https://fireworks.ai/models/fireworks/firefunction-v1)



## Intended Use and Limitations

### Key Highlights
⭐️ Near GPT-4 level quality for real-world use cases of structured information generation and routing decision-making

💨 Blazing fast speed. Inference is roughly 4x faster than GPT-4 when using FireFunction hosted on the Fireworks platform

🔄 Support for the "any" parameter in tool_choice. FireFunction is the only model we're aware of that supports an option forcing the model to always choose a function, which is particularly helpful for routing use cases (see the sketch below)
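
As a hedged sketch, reusing the OpenAI-compatible client and `tools` from the setup snippet above (note that `"any"` is a Fireworks extension; the stock OpenAI API only defines `"none"`, `"auto"`, or a named function):

```python
# tool_choice="any" forces the model to call one of the provided functions
response = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v1",
    messages=[{"role": "user", "content": "Is 'listen' an anagram of 'silent'?"}],
    tools=tools,  # same illustrative tool list as in the setup sketch
    tool_choice="any",
)
```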


### Primary Use
Although the model was trained on a variety of tasks, it performs best on:
 * single-turn request routing to a function picked from a pool of up to 20 function specs.
 * structured information extraction.

See the [blog post](https://fireworks.ai/blog) for more info on FireFunction.

### Out-of-Scope Use
The model was not optimized for the following use cases:
  * general multi-turn chat,
  * parallel and nested function calls in a single response. These can be broken into multiple messages, as sketched below.
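
For instance, over the OpenAI-compatible API (continuing from the client, `tools`, and `messages` in the setup sketch above; the tool output JSON is hypothetical), two calls can be issued as sequential turns rather than one parallel response:

```python
first = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v1",
    messages=messages,
    tools=tools,
)
call = first.choices[0].message.tool_calls[0]

# Feed the function result back as a "tool" turn, then continue the conversation
messages.append(first.choices[0].message)  # the assistant turn containing the call
messages.append({
    "role": "tool",
    "tool_call_id": call.id,
    "content": '{"price": 189.84}',  # hypothetical output of get_stock_price
})
second = client.chat.completions.create(
    model="accounts/fireworks/models/firefunction-v1",
    messages=messages,
    tools=tools,
)
print(second.choices[0].message.content)
```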

## Example Usage

See [documentation](https://readme.fireworks.ai/docs/function-calling) for more detail.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import json

model = AutoModelForCausalLM.from_pretrained("fireworks-ai/firefunction-v1", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("fireworks-ai/firefunction-v1")

function_spec = [
    {
        "name": "get_stock_price",
        "description": "Get the current stock price",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "The stock symbol, e.g. AAPL, GOOG"
                }
            },
            "required": [
                "symbol"
            ]
        }
    },
    {
        "name": "check_word_anagram",
        "description": "Check if two words are anagrams of each other",
        "parameters": {
            "type": "object",
            "properties": {
                "word1": {
                    "type": "string",
                    "description": "The first word"
                },
                "word2": {
                    "type": "string",
                    "description": "The second word"
                }
            },
            "required": [
                "word1",
                "word2"
            ]
        }
    }
]
functions = json.dumps(function_spec, indent=4)

# FireFunction's chat template takes the JSON-encoded function specs via a
# dedicated "functions" role, followed by the usual system and user turns
messages = [
    {'role': 'functions', 'content': functions},
    {'role': 'system', 'content': 'You are a helpful assistant with access to functions. Use them if required.'},
    {'role': 'user', 'content': 'Hi, can you tell me the current stock price of AAPL?'}
]

model_inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

generated_ids = model.generate(model_inputs, max_new_tokens=128)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
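
The decoded string includes the prompt as well as the completion. As a hedged follow-up, you can slice off the prompt tokens and attempt to parse the completion as a JSON function call (the exact output format is covered in the [function-calling docs](https://readme.fireworks.ai/docs/function-calling); the parse step below is a sketch):

```python
# Keep only the newly generated tokens, then try to parse them as JSON;
# fall back to raw text if the model answered in prose instead
new_tokens = generated_ids[0][model_inputs.shape[-1]:]
completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
try:
    call = json.loads(completion)
    print(call["name"], call.get("arguments"))
except json.JSONDecodeError:
    print(completion)
```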