eugenepentland committed on
Commit
6ff0068
1 Parent(s): 3226af0

Update README.md

Files changed (1): README.md (+3 −0)
README.md CHANGED
@@ -17,6 +17,9 @@ datasets:
  We have used our own [OpenOrca dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) to fine-tune LLaMA-13B.
  This dataset is our attempt to reproduce the dataset generated for Microsoft Research's [Orca Paper](https://arxiv.org/abs/2306.02707).
 
+ Want to visualize our dataset? Check out our [Nomic Atlas Map](https://atlas.nomic.ai/map/c1b88b47-2d9b-47e0-9002-b80766792582/2560fd25-52fe-42f1-a58f-ff5eccc890d2).
+ [<img src="https://i.ibb.co/vdd1XQg/image.png" alt="Atlas Nomic Dataset Map" width="400" height="400" />](https://atlas.nomic.ai/map/c1b88b47-2d9b-47e0-9002-b80766792582/2560fd25-52fe-42f1-a58f-ff5eccc890d2)
+
  We have trained on less than 6% of our data, just to give a preview of what is possible while we further refine our dataset!
  We trained a refined selection of 200k GPT-4 entries from OpenOrca.
  We have filtered our GPT-4 augmentations to remove statements like "As an AI language model..." and other responses which have been shown to harm model reasoning capabilities. Further details on our dataset curation practices will be forthcoming with our full model releases.
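The phrase-based filtering described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual OpenOrca curation code: the phrase list, the `response` field name, and the `filter_entries` helper are all assumptions for the sake of the example.

```python
# Hypothetical sketch of phrase-based filtering of GPT-4 augmentations.
# The phrases and record layout below are illustrative assumptions only.
REFUSAL_PHRASES = [
    "as an ai language model",
    "as a language model",
    "i cannot fulfill",
]

def is_clean(response: str) -> bool:
    """Return True if the response contains none of the boilerplate phrases."""
    lowered = response.lower()
    return not any(phrase in lowered for phrase in REFUSAL_PHRASES)

def filter_entries(entries: list[dict]) -> list[dict]:
    """Keep only entries whose response passes the phrase check."""
    return [e for e in entries if is_clean(e["response"])]
```

A simple substring check like this catches the most common refusal boilerplate; a production pipeline would likely combine it with additional heuristics before training.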