How to use nowhereman14/llama3-8b-react-distilled-70b with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = PeftModel.from_pretrained(base_model, "nowhereman14/llama3-8b-react-distilled-70b")
React-FT-8B
A LoRA fine-tune of meta-llama/Llama-3.1-8B-Instruct that performs ReAct-style agentic reasoning over task-oriented dialogue, distilled from a larger ICL-prompted teacher model (meta-llama/Llama-3.3-70B-Instruct).
Model Description
This model implements an adaptation of the ReAct framework tailored to the MultiWOZ dataset. It is designed to actively retrieve information about relevant instances through a REST API while preventing access to unrelated information. The system receives the instruction prompt, the dialogue history, and the user’s query as inputs, and generates the final response through an iterative sequence of thoughts, actions, and observations. The actions use the following tools:
• Look[]: returns the metadata of all the instances of the selected travel scene.
• Search[]: retrieves the instances in the scene that satisfy a specific query.
• Finish[answer]: ends the trajectory and marks answer as the final response that the system returns.
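The thought, action, observation loop described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the action-parsing regex, the `run_react` helper, the in-memory `HOTELS` data, the `key=value` query syntax for Search, and the scripted stand-in for the model are all assumptions made for the example. The real system queries a REST API and calls the fine-tuned model at each step.

```python
import re
from typing import Callable, Dict, Tuple

# Matches the first action in a step, e.g. "Search[area=centre]".
# Assumption: the argument itself contains no "]" character.
ACTION_RE = re.compile(r"(Look|Search|Finish)\[(.*?)\]", re.DOTALL)


def parse_action(text: str) -> Tuple[str, str]:
    """Return (tool_name, argument) for the first action found in `text`."""
    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"no action found in: {text!r}")
    return match.group(1), match.group(2).strip()


def run_react(
    generate_step: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    prompt: str,
    max_steps: int = 8,
) -> str:
    """Alternate model steps and tool observations until Finish[...] appears."""
    transcript = prompt
    for _ in range(max_steps):
        step = generate_step(transcript)      # one "Thought ... Action ..." step
        transcript += step
        name, arg = parse_action(step)
        if name == "Finish":
            return arg                        # final response for the user
        observation = tools[name](arg)        # Look or Search
        transcript += f"\nObservation: {observation}\n"
    raise RuntimeError("no Finish action within max_steps")


# Hypothetical in-memory stand-in for the REST API backend.
HOTELS = [
    {"name": "Alpha Inn", "area": "north"},
    {"name": "Beta Lodge", "area": "centre"},
]


def look(_: str) -> str:
    """Describe which fields the instances in the scene expose."""
    return "fields: " + ", ".join(sorted(HOTELS[0]))


def search(query: str) -> str:
    """Return names of instances matching a 'key=value' query."""
    key, _, value = query.partition("=")
    hits = [h["name"] for h in HOTELS if h.get(key.strip()) == value.strip()]
    return ", ".join(hits) or "no results"


# Scripted replies stand in for the fine-tuned model in this sketch.
SCRIPT = iter([
    "Thought: I should check the available fields.\nAction: Look[]",
    "Thought: Now filter by area.\nAction: Search[area=centre]",
    "Thought: I have the answer.\nAction: Finish[Beta Lodge is in the centre.]",
])

answer = run_react(
    lambda transcript: next(SCRIPT),
    {"Look": look, "Search": search},
    "User: I need a hotel in the centre.\n",
)
print(answer)
```

In practice `generate_step` would wrap a call to the model's `generate` method, stopping after each action so that the observation can be appended before the next step.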
This specific checkpoint (React FT 8B) is the result of an LLM-distillation process. A series of reasoning trajectories was generated using a larger model (React ICL 70B) via in-context learning (ICL) inference.
These trajectories were filtered using an LLM-based judge (Prometheus) to construct a high-quality training set. This curated dataset was used to fine-tune Llama-3.1-8B-Instruct via LoRA (PEFT), producing a smaller, more efficient model that outperforms its larger teacher in reasoning quality.
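The filtering step might look roughly like the sketch below. The `Trajectory` record, the integer score scale, the `min_score` threshold, and the extra check for a `Finish[` action are all assumptions for illustration. The card does not state the rubric or cutoff actually used with the judge.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Trajectory:
    dialogue_id: str
    text: str          # full thought/action/observation trace from the teacher
    judge_score: int   # hypothetical rubric score assigned by the judge model


def filter_trajectories(
    trajectories: List[Trajectory], min_score: int = 4
) -> List[Trajectory]:
    """Keep traces the judge rated at least `min_score` and that terminate."""
    return [
        t for t in trajectories
        if t.judge_score >= min_score and "Finish[" in t.text
    ]


# Toy examples; real traces would come from the 70B teacher's ICL runs.
samples = [
    Trajectory("d1", "Thought: ...\nAction: Finish[Booked.]", 5),
    Trajectory("d2", "Thought: ...\nAction: Finish[Sorry.]", 2),
    Trajectory("d3", "Thought: ...\nAction: Search[area=north]", 5),
]

kept = filter_trajectories(samples)
print([t.dialogue_id for t in kept])
```

Only the retained traces would then be formatted as training examples for the LoRA fine-tune.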
Model tree for nowhereman14/llama3-8b-react-distilled-70b
Base model
meta-llama/Llama-3.1-8B