eugenepentland committed on
Commit
6ff0068
1 Parent(s): 3226af0

Update README.md

Files changed (1): README.md (+3 −0)
README.md CHANGED
@@ -17,6 +17,9 @@ datasets:
  We have used our own [OpenOrca dataset](https://huggingface.co/datasets/Open-Orca/OpenOrca) to fine-tune LLaMA-13B.
  This dataset is our attempt to reproduce the dataset generated for Microsoft Research's [Orca Paper](https://arxiv.org/abs/2306.02707).
 
+ Want to visualize our dataset? Check out our [Nomic Atlas Map](https://atlas.nomic.ai/map/c1b88b47-2d9b-47e0-9002-b80766792582/2560fd25-52fe-42f1-a58f-ff5eccc890d2).
+ [<img src="https://i.ibb.co/vdd1XQg/image.png" alt="Atlas Nomic Dataset Map" width="400" height="400" />](https://atlas.nomic.ai/map/c1b88b47-2d9b-47e0-9002-b80766792582/2560fd25-52fe-42f1-a58f-ff5eccc890d2)
+
  We have trained on less than 6% of our data, just to give a preview of what is possible while we further refine our dataset!
  We trained a refined selection of 200k GPT-4 entries from OpenOrca.
  We have filtered our GPT-4 augmentations to remove statements like "As an AI language model..." and other responses which have been shown to harm model reasoning capabilities. Further details on our dataset curation practices will be forthcoming with our full model releases.
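The phrase-based filtering described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual OpenOrca curation code: the phrase list, the `response` field name, and the `filter_entries` helper are all assumptions for the sake of the example.

```python
# Hypothetical sketch of phrase-based filtering of GPT-4 augmentations.
# The phrases and record layout below are illustrative assumptions only.
REFUSAL_PHRASES = [
    "as an ai language model",
    "as a language model",
    "i cannot fulfill",
]

def is_clean(response: str) -> bool:
    """Return True if the response contains none of the boilerplate phrases."""
    lowered = response.lower()
    return not any(phrase in lowered for phrase in REFUSAL_PHRASES)

def filter_entries(entries: list[dict]) -> list[dict]:
    """Keep only entries whose response passes the phrase check."""
    return [e for e in entries if is_clean(e["response"])]
```

A simple substring check like this catches the most common refusal boilerplate; a production pipeline would likely combine it with additional heuristics before training.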