Commit c69d37f by bleysg (parent: a9bc9a9): Update README.md

Files changed: README.md (+55, -1)

  Unreleased, untested, unfinished beta.

We've trained Microsoft Research's [phi-1.5](https://huggingface.co/microsoft/phi-1_5), a 1.3B parameter model, with the same OpenOrca dataset as we used with our [OpenOrcaxOpenChat-Preview2-13B](https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-13B) model.

This model doesn't dramatically improve on the base model's general task performance, but the instruction tuning has made the model reliably handle the ChatML prompt format.


  # Evaluations

We've only done very limited testing as yet. The [epoch 4.5 checkpoint](https://huggingface.co/Open-Orca/oo-phi-1_5/commit/aa05eb2596d6d11951695d2e327616188d768880) scores above 5 on MT-Bench (better than Alpaca-13B, worse than Llama2-7b-chat), while preliminary benchmarks suggest peak average performance was achieved roughly at epoch 4.

## HuggingFaceH4 Open LLM Leaderboard Performance

The only significant improvement was with TruthfulQA.

  ![HF Leaderboard](https://huggingface.co/Open-Orca/oo-phi-1_5/resolve/main/Images/oo-phi-1_5-HFLeaderboard.png)


## MT-bench Performance

Epoch 4.5 result:
  ```
Mode: single
Input file: data/mt_bench/model_judgment/gpt-4_single.jsonl
...
model
oo-phi-1_5 5.03125
  ```
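The block above is the summary printed by FastChat's MT-Bench (`llm_judge`) tooling. As a hedged sketch, a score like this is produced roughly as follows, run from FastChat's `llm_judge` directory; the model ID mirrors the table above, while the other flag values are assumptions:

```sh
# Sketch, not the card's exact invocation:
# generate answers, judge them with GPT-4, then print the score table.
python gen_model_answer.py --model-path Open-Orca/oo-phi-1_5 --model-id oo-phi-1_5
python gen_judgment.py --model-list oo-phi-1_5 --parallel 2
python show_result.py --model-list oo-phi-1_5
```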


# Training

Trained with full-parameter fine-tuning on 8x RTX A6000-48GB (Ampere) for 5 epochs over 62 hours (12.5h/epoch), at a commodity cost of $390 (about $80/epoch).

We did not use [MultiPack](https://github.com/imoneoi/multipack_sampler) packing.

  [<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)


# Prompt Template

We used [OpenAI's Chat Markup Language (ChatML)](https://github.com/openai/openai-python/blob/main/chatml.md) format, with `<|im_start|>` and `<|im_end|>` tokens added to support this.
This means that, e.g., in [oobabooga](https://github.com/oobabooga/text-generation-webui/) the `MPT-Chat` instruction template should work.
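
For illustration, a full prompt in this format looks like the following (the message contents are examples, not from the card):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, who are you?<|im_end|>
<|im_start|>assistant
```
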
  # Inference

Remove `.to('cuda')` to run without GPU acceleration.
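
The card's original code snippet is elided in this diff view; below is a minimal sketch of ChatML-style generation with `transformers`. The generation settings and the example prompt are assumptions, and `trust_remote_code=True` is needed for phi-1.5's custom model code:

```python
# Minimal sketch, not the card's original snippet: ChatML inference with transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Open-Orca/oo-phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).to('cuda')  # remove .to('cuda') to run without GPU acceleration

# Build a ChatML prompt (see the Prompt Template section above).
prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nWhat programming tasks are you skilled in?<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to('cuda')  # remove .to('cuda') for CPU
output_ids = model.generate(
    **inputs,
    max_new_tokens=512,
    eos_token_id=tokenizer.convert_tokens_to_ids("<|im_end|>"),  # stop at end-of-turn
)
print(tokenizer.decode(output_ids[0]))
```
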
Example output (truncated):

```
In terms of programming tasks, I am particularly skilled in:
...
10. Fraud Detection: I can detect and prevent fraudulent activities, protecting users' financial information and ensuring secure transactions.

These programming tasks showcase my ability to understand and process vast amounts of information while adapting to different contexts and user needs. As an AI, I continuously learn and evolve to become even more effective in assisting users.<|im_end|>
```
 
# Citation

```bibtex
@software{lian2023oophi15,
  title = {OpenOrca oo-phi-1.5: Phi-1.5 1.3B Model Instruct-tuned on Filtered OpenOrcaV1 GPT-4 Dataset},
  author = {Wing Lian and Bleys Goodson and Guan Wang and Eugene Pentland and Austin Cook and Chanvichet Vong and "Teknium"},
  year = {2023},
  publisher = {HuggingFace},
  journal = {HuggingFace repository},
  howpublished = {\url{https://huggingface.co/Open-Orca/oo-phi-1_5}},
}
@article{textbooks2,
  title = {Textbooks Are All You Need II: \textbf{phi-1.5} technical report},
  author = {Li, Yuanzhi and Bubeck, S{\'e}bastien and Eldan, Ronen and Del Giorno, Allie and Gunasekar, Suriya and Lee, Yin Tat},
  journal = {arXiv preprint arXiv:2309.05463},
  year = {2023}
}
@misc{mukherjee2023orca,
  title = {Orca: Progressive Learning from Complex Explanation Traces of GPT-4},
  author = {Subhabrata Mukherjee and Arindam Mitra and Ganesh Jawahar and Sahaj Agarwal and Hamid Palangi and Ahmed Awadallah},
  year = {2023},
  eprint = {2306.02707},
  archivePrefix = {arXiv},
  primaryClass = {cs.CL}
}
@misc{longpre2023flan,
  title = {The Flan Collection: Designing Data and Methods for Effective Instruction Tuning},
  author = {Shayne Longpre and Le Hou and Tu Vu and Albert Webson and Hyung Won Chung and Yi Tay and Denny Zhou and Quoc V. Le and Barret Zoph and Jason Wei and Adam Roberts},
  year = {2023},
  eprint = {2301.13688},
  archivePrefix = {arXiv},
  primaryClass = {cs.AI}
}
```