Undi95 committed
Commit b04f4e8
1 Parent(s): b4086d2

Update README.md

Files changed (1)
  1. README.md +42 -7
README.md CHANGED
@@ -24,21 +24,56 @@ As some people have told us our models are sloppy, Ikari decided to say fuck it
 
  Our dataset stayed the same since day one, we added data over time, cleaned them, and repeat. After not releasing model for a while because we were never satisfied, we think it's time to come back!
 
+ # Prompt template: Mistral
+
+ ```
+ <s>[INST] {input} [/INST] {output}</s>
+ ```
 
  ## Credits:
  - Undi
  - IkariDev
 
- ## Training data used:
- We will point out all dataset we used here, please be patient the time we get them all back kek.
+ ## Training data we used to make our dataset:
 
- Temporary credit for the following madlads, who contributed to the datasets we have build over time: Gryphe, Caitlyn, Kalomaze, Gifted Gummy Bee, Sao [...]
+ - [Epiculous/Gnosis](https://huggingface.co/Epiculous/Gnosis)
+ - [ChaoticNeutrals/Luminous_Opus](https://huggingface.co/datasets/ChaoticNeutrals/Luminous_Opus)
+ - [ChaoticNeutrals/Synthetic-Dark-RP](https://huggingface.co/datasets/ChaoticNeutrals/Synthetic-Dark-RP)
+ - [ChaoticNeutrals/Synthetic-RP](https://huggingface.co/datasets/ChaoticNeutrals/Synthetic-RP)
+ - [Gryphe/Sonnet3.5-SlimOrcaDedupCleaned](https://huggingface.co/datasets/Gryphe/Sonnet3.5-SlimOrcaDedupCleaned)
+ - [Gryphe/Opus-WritingPrompts](https://huggingface.co/datasets/Gryphe/Opus-WritingPrompts)
+ - [meseca/writing-opus-6k](https://huggingface.co/datasets/meseca/writing-opus-6k)
+ - [meseca/opus-instruct-9k](https://huggingface.co/datasets/meseca/opus-instruct-9k)
+ - [PJMixers/grimulkan_theory-of-mind-ShareGPT](https://huggingface.co/datasets/PJMixers/grimulkan_theory-of-mind-ShareGPT)
+ - [NobodyExistsOnTheInternet/ToxicQAFinal](https://huggingface.co/datasets/NobodyExistsOnTheInternet/ToxicQAFinal)
+ - [Undi95/toxic-dpo-v0.1-sharegpt](https://huggingface.co/datasets/Undi95/toxic-dpo-v0.1-sharegpt)
+ - [cgato/SlimOrcaDedupCleaned](https://huggingface.co/datasets/cgato/SlimOrcaDedupCleaned)
+ - [kalomaze/Opus_Instruct_25k](https://huggingface.co/datasets/kalomaze/Opus_Instruct_25k)
+ - [Doctor-Shotgun/no-robots-sharegpt](https://huggingface.co/datasets/Doctor-Shotgun/no-robots-sharegpt)
+ - [Norquinal/claude_multiround_chat_30k](https://huggingface.co/datasets/Norquinal/claude_multiround_chat_30k)
+ - [nothingiisreal/Claude-3-Opus-Instruct-15K](https://huggingface.co/datasets/nothingiisreal/Claude-3-Opus-Instruct-15K)
+ - All the Aesirs dataset, cleaned, unslopped
+ - All le luminae dataset, cleaned, unslopped
+ - Small part of Airoboros reduced
 
- # Prompt template: Mistral
+ We sadly didn't find the sources of the following, DM us if you recognize your set !
 
- ```
- <s>[INST] {input} [/INST] {output}</s>
- ```
+ - Opus_Instruct-v2-6.5K-Filtered-v2-sharegpt
+ - claude_sharegpt_trimmed
+ - CapybaraPure_Decontaminated-ShareGPT_reduced
+
+ ## Datasets credits:
+ - Epiculous
+ - ChaoticNeutrals
+ - Gryphe
+ - meseca
+ - PJMixers
+ - NobodyExistsOnTheInternet
+ - cgato
+ - kalomaze
+ - Doctor-Shotgun
+ - Norquinal
+ - nothingiisreal
 
  ## Others
 
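
The Mistral prompt template that this commit moves up in the README can be filled in with a few lines of Python. This is only a minimal sketch: the helper name `format_mistral_prompt` and its parameters are hypothetical and do not come from the repository; only the template string itself is taken from the diff.

```python
# Template string copied verbatim from the README diff.
MISTRAL_TEMPLATE = "<s>[INST] {input} [/INST] {output}</s>"


def format_mistral_prompt(user_input: str, output: str = "") -> str:
    """Fill the Mistral template (hypothetical helper, not from the repo).

    With a non-empty `output`, the full training-style string is returned.
    With an empty `output`, the string stops right after [/INST] so the
    model can generate the reply itself.
    """
    if output:
        return MISTRAL_TEMPLATE.format(input=user_input, output=output)
    # Inference case: no reply yet, so drop the {output}</s> tail.
    return "<s>[INST] {input} [/INST]".format(input=user_input)


if __name__ == "__main__":
    print(format_mistral_prompt("Hello!"))
    print(format_mistral_prompt("Hello!", "Hi there."))
```

Many tokenizers prepend the `<s>` token on their own, so it is worth checking whether the literal `<s>` should be kept before passing the string to one.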