---
library_name: transformers
license: apache-2.0
datasets:
- THUDM/AgentInstruct
language:
- en
---

# Mistral-7B fine-tuned on AgentInstruct

[Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) fine-tuned on the [AgentInstruct](https://huggingface.co/datasets/THUDM/AgentInstruct) dataset to make it *better* at acting as an agent.



## Model Details

### Model Description

The base model, Mistral-7B-v0.1, is a pretrained generative text Large Language Model (LLM) with 7 billion parameters that outperforms Llama 2 13B on all benchmarks reported by Mistral AI.

For full details of the base model, please read the Mistral AI [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

## Model Architecture

Mistral-7B-v0.1 is a transformer model with the following architecture choices (a quick config check follows the list):
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
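
These choices can be inspected directly from the model's configuration and tokenizer. A minimal sketch; the numeric values in the comments are those of the base Mistral-7B-v0.1 and are only illustrative:

```py
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("mrm8488/mistral-7b-ft-AgentInstruct")

# Grouped-Query Attention: fewer key/value heads than query heads
print(config.num_attention_heads, config.num_key_value_heads)  # 32, 8

# Sliding-Window Attention: size of the local attention window
print(config.sliding_window)  # 4096

# Byte-fallback BPE tokenizer: arbitrary byte sequences stay representable
tokenizer = AutoTokenizer.from_pretrained("mrm8488/mistral-7b-ft-AgentInstruct")
print(tokenizer.convert_ids_to_tokens(tokenizer("🤖")["input_ids"]))
```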



## Dataset Details

**AgentInstruct** is a meticulously curated dataset featuring **1,866** high-quality interactions, designed to enhance AI agents across six diverse real-world tasks, leveraging innovative methods like **Task Derivation** and **Self-Instruct**. A short loading sketch follows the feature list below.

- 🔍 **CoT** - Harnesses the power of [ReAct](https://react-lm.github.io/), providing a detailed thought explanation for each action so the model's decision-making process stays transparent.
- 🌍 **Diversity** - Spans 6 real-world scenarios, from Daily Household Routines to Database Operations, with average turns ranging from 5 to 35.
- 🎯 **Precision** - Not all GPT-4 trajectories are effective! Ours are rigorously filtered using strict reward criteria to ensure top-notch quality.
- ✅ **Assurance** - Rigorous checks to avoid data leakage, ensuring pristine dataset quality.
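
The dataset can be inspected with the 🤗 `datasets` library. A minimal loading sketch; the split names and the `conversations` field layout are assumptions about the published dataset, not guarantees:

```py
from datasets import load_dataset

# Load THUDM/AgentInstruct; each task is assumed to be its own split
# (e.g. os, db, alfworld, webshop, mind2web, kg)
ds = load_dataset("THUDM/AgentInstruct")
print(ds)

# Peek at one multi-turn ReAct-style trajectory
# (assumed ShareGPT-like "conversations" field)
first_split = list(ds.keys())[0]
example = ds[first_split][0]
print(example["conversations"][:2])
```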

## Task Overview

| Task | # Filtered Trajectories | Avg. Turns per Filtered Trajectory |
|---|---|---|
|ALFWorld|336|13.52|
|WebShop|351|3.68|
|Mind2Web|122|1.00|
|Knowledge Graph|324|6.04|
|Operating System|195|3.85|
|Database|538|2.06|
|**AgentInstruct**|1866|5.24|

AgentInstruct comprises 1,866 filtered interaction trajectories collected across the 6 agent tasks above; the counts and average turn lengths in the table refer to these filtered trajectories.
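
As a quick sanity check, the overall average of 5.24 turns is consistent with the trajectory-weighted mean of the per-task averages:

```py
# Per-task (filtered trajectory count, average turns) taken from the table above
tasks = {
    "ALFWorld": (336, 13.52),
    "WebShop": (351, 3.68),
    "Mind2Web": (122, 1.00),
    "Knowledge Graph": (324, 6.04),
    "Operating System": (195, 3.85),
    "Database": (538, 2.06),
}

total = sum(n for n, _ in tasks.values())                        # 1866
weighted_avg = sum(n * turns for n, turns in tasks.values()) / total
print(total, round(weighted_avg, 2))                             # 1866 5.24
```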

## Training Details
TBD

## Example of usage

```py
from transformers import AutoTokenizer, AutoModelForCausalLM, StoppingCriteria, StoppingCriteriaList
import torch

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("mrm8488/mistral-7b-ft-AgentInstruct")
model = AutoModelForCausalLM.from_pretrained("mrm8488/mistral-7b-ft-AgentInstruct").to("cuda")

class MyStoppingCriteria(StoppingCriteria):
    def __init__(self, target_sequence, prompt):
        self.target_sequence = target_sequence
        self.prompt = prompt

    def __call__(self, input_ids, scores, **kwargs):
        # Decode without prompt and check for target sequence
        generated_text = tokenizer.decode(input_ids[0]).replace(self.prompt, '')
        return self.target_sequence in generated_text

def generate(context, max_new_tokens=256, min_new_tokens=64, temperature=0.3, top_p=0.75, top_k=40, do_sample=True, num_beams=2):
    # Prepare input data
    inputs = tokenizer(context, return_tensors="pt")
    input_ids = inputs["input_ids"].to("cuda")
    attention_mask = inputs["attention_mask"].to("cuda")

    # Generation settings
    generation_settings = {
        "max_new_tokens": max_new_tokens,
        "min_new_tokens": min_new_tokens,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
        "do_sample": do_sample,
        "num_beams": num_beams,
        "early_stopping": False,
        "use_cache": True,
        "stopping_criteria": MyStoppingCriteria("### human:", context)
    }

    # Generate response
    with torch.no_grad():
        generation_output = model.generate(input_ids=input_ids, attention_mask=attention_mask, **generation_settings)

    output = tokenizer.decode(generation_output[0])
    return output

# Example usage
context = ""
human = """### human: Among the reference ID of under 10 who got response by marketing department, compare their education status.
There are 2 tables involved with this task. The name of the 1st table is Customers, and the headers of this table are ID,SEX,MARITAL_STATUS,GEOID,EDUCATIONNUM,OCCUPATION,age. The name of the 2nd table is Mailings1_2, and the headers of this table are REFID,REF_DATE,RESPONSE."""
context = human

solution = generate(context)
print(solution)
```
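
Loading the model in full precision needs roughly 28 GB of GPU memory. If that is tight, a lower-precision load is a reasonable alternative; a minimal sketch, assuming a CUDA device and the `accelerate` package for `device_map`:

```py
import torch
from transformers import AutoModelForCausalLM

# Half-precision load: roughly halves GPU memory vs. the default float32
model = AutoModelForCausalLM.from_pretrained(
    "mrm8488/mistral-7b-ft-AgentInstruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
```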

## Citation
```bibtex
@misc {manuel_romero_2024,
	author       = { {Manuel Romero} },
	title        = { mistral-7b-ft-AgentInstruct (Revision 463b96d) },
	year         = 2024,
	url          = { https://huggingface.co/mrm8488/mistral-7b-ft-AgentInstruct },
	doi          = { 10.57967/hf/1650 },
	publisher    = { Hugging Face }
}
```