Triangle104 committed
Commit 75f36b8 · verified · Parent(s): 2d9ad19

Update README.md

Files changed (1): README.md (+90 -0)

README.md CHANGED
@@ -26,6 +26,96 @@ model-index:

This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) for more details on the model.
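
For readers who want to reproduce the conversion locally rather than through the GGUF-my-repo space, a rough sketch using llama.cpp's own tools could look like the following; the local paths and the Q4_K_M quantization type are illustrative assumptions, not taken from this repo.

```bash
# Convert the original Hugging Face checkpoint to GGUF, then quantize it.
# Paths and the quantization type below are placeholders.
python convert_hf_to_gguf.py ./EVA-Qwen2.5-32B-v0.2 \
  --outfile ./EVA-Qwen2.5-32B-v0.2-f16.gguf --outtype f16

llama-quantize ./EVA-Qwen2.5-32B-v0.2-f16.gguf \
  ./EVA-Qwen2.5-32B-v0.2-Q4_K_M.gguf Q4_K_M
```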

---
## Model details

An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanded to improve the versatility, creativity and "flavor" of the resulting model.

Dedicated to Nev.

Version notes for 0.2: the whole dataset was reprocessed due to a severe mistake in the previously used pipeline, which had left the data poisoned with a large number of non-Unicode characters. The result is no more weird generation artifacts and better stability. Major kudos to Cahvay for his work on fixing this critical issue.

Prompt format is ChatML.
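
For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` tokens; a minimal prompt looks like the block below (the system and user text are placeholders, not taken from this repo).

```
<|im_start|>system
You are a creative roleplay assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```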

Recommended sampler values (a sketch of how they map to llama.cpp flags follows this list):

- Temperature: 1
- Min-P: 0.05
- Top-A: 0.2
- Repetition Penalty: 1.03
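
As a rough sketch only: with llama.cpp these settings translate to the flags below. The GGUF filename is a placeholder, and llama-cli does not expose a Top-A sampler as far as I know, so that value only applies in frontends that support it.

```bash
# Placeholder model path; sampler flags follow the recommendations above.
llama-cli -m ./EVA-Qwen2.5-32B-v0.2-Q4_K_M.gguf \
  --temp 1.0 --min-p 0.05 --repeat-penalty 1.03 \
  -cnv -p "You are a creative roleplay assistant."
```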

Recommended SillyTavern presets (via CalamitousFelicitousness):

- Context
- Instruct and System Prompt

Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- Synthstruct and SynthRP datasets by Epiculous.
- A subset from Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.

Training time and hardware: 7 hours on an 8xH100 SXM node, provided by FeatherlessAI.

The model was created by Kearm, Auri and Cahvay.

Special thanks:

- to Cahvay for his work on investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning;
- to FeatherlessAI for generously providing an 8xH100 SXM node for training this model;
- to Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data;
- and to Allura-org for support, feedback, beta-testing and quality control of EVA models.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
121