---
datasets:
- databricks/databricks-dolly-15k
language:
- en
---

# Open-Instruct Dolly 7B

This model is a 7B LLaMa model finetuned on the Dolly dataset. *please note this is a model diff - see below for usage instructions*.

This was trained as part of the paper [How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources](arxiv.org/abs/xxxx).
The codebase used to train and evaluate this model can be found at [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct).

This model is licensed under a modified LlaMa license, see License.txt for details.

## Usage

We assume you have access to a LLaMa model in HF format already. You can find details on getting access and converting the model here:
[https://huggingface.co/docs/transformers/main/model_doc/llama](https://huggingface.co/docs/transformers/main/model_doc/llama)

Clone [https://github.com/allenai/open-instruct](https://github.com/allenai/open-instruct) and install the required dependencies, or just copy `scripts/weight_diff.py`
and install the minimal requirements listed in `weight-diff-requirements.txt`. Then download or clone this model diff to the same machine.

Then, run:
```bash
python scripts/weight_diff.py recover --path_raw ${hf_llama_path} --path_tuned ${output_path} --path_diff ${diff_location}
```

And you will have a recovered model! Note this takes up a decent amount of RAM, especially for the larger models.

## Input Format

The model is trained to use the following format:
```
<|user|>
Your message here!
<|assistant|>
```

For best results, format all inputs in this manner.

## Performance

Here is the performance of this model across benchmarks explored in our paper [How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources](arxiv.org/abs/xxxx):

| MMLU 0-shot | MMLU 5-shot | GSM Direct | GSM CoT | BBH Direct | BBH CoT | TydiQA Gold-Passage | TydiQA Closed-book | Codex-Eval Pass@1 | Codex-Eval Pass@10 | AlpacaFarm vs Davinci-003 | Average |
|:-----------:|:-----------:|:----------:|:-------:|:----------:|:-------:|:-------------------:|:------------------:|:-----------------:|:------------------:|:-------------------------:|---------|
|    38.0    |    35.8    |    5.0   |  7.0  |    27.2   |  24.4  |        43.6       |        8.7       |       11.1       |        22.1       |           12.7           | 20.7    |


If you use this model, please cite our work, the llama paper, and the original dataset:

```
@article{camelevaluation,
  title={How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources},
  author={Yizhong Wang, Hamish Ivison, Pradeep Dasigi, Jack Hessel, Tushar Khot, Khyathi Raghavi Chandu, David Wadden, Kelsey MacMillan, Noah A. Smith, Iz Beltagy, Hannaneh Hajishirzi},
  year={2023}
}
```

```
@misc{touvron2023llama,
      title={LLaMA: Open and Efficient Foundation Language Models}, 
      author={Hugo Touvron and Thibaut Lavril and Gautier Izacard and Xavier Martinet and Marie-Anne Lachaux and Timothée Lacroix and Baptiste Rozière and Naman Goyal and Eric Hambro and Faisal Azhar and Aurelien Rodriguez and Armand Joulin and Edouard Grave and Guillaume Lample},
      year={2023},
      eprint={2302.13971},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

```
@misc{dolly,
  author = {Databricks},
  title = {Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {Blog post},
  url = {https://www.databricks.com/blog/2023/04/12/dolly-first-open-commercially-viable-instruction-tuned-llm}
}
```