Triangle104 committed
Commit 53130dc
1 Parent(s): bf48139

Update README.md

Files changed (1): README.md +43 -0
README.md CHANGED
@@ -17,6 +17,49 @@ base_model: vicgalle/Roleplay-Hermes-3-Llama-3.1-8B
  This model was converted to GGUF format from [`vicgalle/Roleplay-Hermes-3-Llama-3.1-8B`](https://huggingface.co/vicgalle/Roleplay-Hermes-3-Llama-3.1-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/vicgalle/Roleplay-Hermes-3-Llama-3.1-8B) for more details on the model.
+ ---
+ Model details:
+
+ A DPO-tuned Hermes-3-Llama-3.1-8B that behaves more "humanish", i.e.,
+ avoiding AI-assistant slop. It also works for role-play (RP). To achieve
+ this, the model was fine-tuned over a series of datasets:
+
+ - Undi95/Weyaxi-humanish-dpo-project-noemoji, to make the model react as a human, rejecting assistant-like or overly neutral responses.
+ - ResplendentAI/NSFW_RP_Format_DPO, to steer the model towards using the *action* format in RP settings. This works best if the first message also uses the format naturally (see the example below).
+
+ Usage example (the import and model/tokenizer loading lines are added here for completeness; they were not part of the original snippet):
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Loading added for completeness; not in the original snippet.
+ model_id = "vicgalle/Roleplay-Hermes-3-Llama-3.1-8B"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
+
+ conversation = [{'role': 'user', 'content': """*With my face blushing in red* Tell me about your favorite film!"""}]
+ prompt = tokenizer.apply_chat_template(conversation, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.8)
+ # Decode only the newly generated tokens.
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
+ ```
+
+ The response is:
+
+ *blushing* Aw, that's a tough one! There are so many great films out
+ there. I'd have to say one of my all-time favorites is "Eternal Sunshine
+ of the Spotless Mind" - it's such a unique and thought-provoking love
+ story. But really, there are so many amazing films! What's your
+ favorite? *I hope mine is at least somewhat decent!*
+
+ Note: you can use system prompts describing the persona for better results.
+
+ ---
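The note above recommends describing the persona in a system prompt. A minimal sketch of the message structure that `apply_chat_template` consumes, with a hypothetical persona (the persona text is illustrative, not from the original model card):

```python
# Hypothetical persona description -- any text describing the character works.
persona = (
    "You are Mia, a shy film student. You speak casually, avoid assistant-like "
    "phrasing, and wrap physical actions in *asterisks*."
)

# System message first, then the user turn; this is the list you would pass
# to tokenizer.apply_chat_template(...) as in the usage example above.
conversation = [
    {"role": "system", "content": persona},
    {"role": "user", "content": "*With my face blushing in red* Tell me about your favorite film!"},
]

# Basic structural checks: roles in order, every message has role/content keys.
assert [m["role"] for m in conversation] == ["system", "user"]
assert all({"role", "content"} <= set(m) for m in conversation)
```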
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux).