zolicsaki (Zoltan Csaki)

Posts 5

Post

1308

We’ve open-sourced an app, powered by SambaNova Cloud and Llama 405B, that intelligently detects when a web search is needed—then answers directly or with RAG.

sambanovasystems/auto-web-search

🥚 A hidden Easter egg is that Auto Search detection is already trained into Llama 3.1 checkpoints. Simply use the tool usage system prompt below, and the model will either respond with a web search query if it deems necessary or respond to the query directly.🥚

Environment: IPython
Tools: Brave Search
Knowledge Cutoff Date: December 2023
Today's Date: September 2024
You are a helpful assistant. Reminder:
Search function calls MUST follow the specified format: "brave_search.call(query)"

You can see the documentation here
https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1#built-in-tooling
and read about how the tool usage was trained into Llama3.1 models in section 4.3.5 here https://arxiv.org/pdf/2407.21783

Post

1339

Fast inference is no longer a nice-to-have demo; it will be the driving force behind future frontier models. Time to switch over to custom AI hardware and short Nvidia.

Try out SambaNova's lightning fast API for free at https://sambanova.ai/fast-api?api_ref=444868

View all Posts

Zoltan Csaki

AI & ML interests

Recent Activity

Organizations

Posts 5

Papers 2

models 0

datasets 0