Triangle104 committed
Commit 75f36b8 · verified · Parent(s): 2d9ad19

Update README.md

Files changed (1): README.md (+90 -0)

README.md CHANGED
@@ -26,6 +26,96 @@ model-index:

This model was converted to GGUF format from [`EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2`](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-32B-v0.2) for more details on the model.
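
For readers who want to reproduce the conversion locally rather than through the GGUF-my-repo space, a rough sketch using llama.cpp's own tools could look like the following; the local paths and the Q4_K_M quantization type are illustrative assumptions, not taken from this repo.

```bash
# Convert the original Hugging Face checkpoint to GGUF, then quantize it.
# Paths and the quantization type below are placeholders.
python convert_hf_to_gguf.py ./EVA-Qwen2.5-32B-v0.2 \
  --outfile ./EVA-Qwen2.5-32B-v0.2-f16.gguf --outtype f16

llama-quantize ./EVA-Qwen2.5-32B-v0.2-f16.gguf \
  ./EVA-Qwen2.5-32B-v0.2-Q4_K_M.gguf Q4_K_M
```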

---
## Model details

An RP/storywriting specialist model: a full-parameter finetune of Qwen2.5-32B on a mixture of synthetic and natural data.

It uses the Celeste 70B 0.1 data mixture, greatly expanded to improve the versatility, creativity and "flavor" of the resulting model.

Dedicated to Nev.

Version notes for 0.2: the whole dataset was reprocessed due to a severe mistake in the previously used pipeline, which had left the data poisoned with a large number of non-Unicode characters. The result is no more weird generation artifacts and better stability. Major kudos to Cahvay for his work on fixing this critical issue.

Prompt format is ChatML.
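
For reference, ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` tokens; a minimal prompt looks like the block below (the system and user text are placeholders, not taken from this repo).

```
<|im_start|>system
You are a creative roleplay assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```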

Recommended sampler values (a sketch of how they map to llama.cpp flags follows this list):

- Temperature: 1
- Min-P: 0.05
- Top-A: 0.2
- Repetition Penalty: 1.03
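
As a rough sketch only: with llama.cpp these settings translate to the flags below. The GGUF filename is a placeholder, and llama-cli does not expose a Top-A sampler as far as I know, so that value only applies in frontends that support it.

```bash
# Placeholder model path; sampler flags follow the recommendations above.
llama-cli -m ./EVA-Qwen2.5-32B-v0.2-Q4_K_M.gguf \
  --temp 1.0 --min-p 0.05 --repeat-penalty 1.03 \
  -cnv -p "You are a creative roleplay assistant."
```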

Recommended SillyTavern presets (via CalamitousFelicitousness):

- Context
- Instruct and System Prompt

Training data:

- Celeste 70B 0.1 data mixture minus the Opus Instruct subset. See that model's card for details.
- Kalomaze's Opus_Instruct_25k dataset, filtered for refusals.
- A subset (1k rows) of ChatGPT-4o-WritingPrompts by Gryphe.
- A subset (2k rows) of Sonnet3.5-Charcards-Roleplay by Gryphe.
- Synthstruct and SynthRP datasets by Epiculous.
- A subset from Dolphin-2.9.3, including a filtered version of not_samantha and a small subset of systemchat.

Training time and hardware: 7 hours on an 8xH100 SXM node, provided by FeatherlessAI.

The model was created by Kearm, Auri and Cahvay.

Special thanks:

- to Cahvay for his work on investigating and reprocessing the corrupted dataset, removing the single biggest source of data poisoning;
- to FeatherlessAI for generously providing an 8xH100 SXM node for training this model;
- to Gryphe, Lemmy, Kalomaze, Nopm, Epiculous and CognitiveComputations for the data;
- and to Allura-org for support, feedback, beta-testing and quality control of EVA models.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
121