StopTryharding/DareBeagel-2x7B-GGUF

Mixture of Experts

mlabonne/NeuralBeagle14-7B

mlabonne/NeuralDaredevil-7B

text-generation-inference

Text Generation

Inference Endpoints


StopTryharding committed on Jan 28

Commit 3feef72 • 1 Parent(s): 5eadf18

Update README.md

Files changed (1)

README.md +6 -20


@@ -52,24 +52,10 @@ experts:
 ## 💻 Usage
-```python
-!pip install -qU transformers bitsandbytes accelerate
-from transformers import AutoTokenizer
-import transformers
-import torch
-model = "shadowml/Beyonder-2x7B-v2"
-tokenizer = AutoTokenizer.from_pretrained(model)
-pipeline = transformers.pipeline(
-    "text-generation",
-    model=model,
-    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
-)
-messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
-prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
-print(outputs[0]["generated_text"])
 ```

 ## 💻 Usage
+```
+Load in Kobold.cpp or whatever. I found Alpaca (and Alpaca-ish) prompts worked well. Settings that worked well for me:
+Min P - 0.1
+Dynamic Temperature Min 0 Max 3
+Rep Pen 1.03
+Rep Pen Range 1000
 ```
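
For anyone scripting against the model instead of using a UI, the new usage note could be sketched roughly as below. This is a minimal sketch, not the author's method: it builds a standard Alpaca-style prompt and collects the listed sampler values into a request body. The helper names (`build_alpaca_prompt`, `build_sampler_settings`) are hypothetical, and the payload keys (`min_p`, `dynatemp_range`, `rep_pen`, `rep_pen_range`, `max_length`) are assumed to match a KoboldCpp-style generate API, so check them against your backend. It also assumes "Dynamic Temperature Min 0 Max 3" maps to a midpoint temperature of 1.5 with a range of 1.5 on either side.

```python
# Standard Alpaca instruction template (the variant without an "Input" field).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str) -> str:
    """Wrap a plain instruction in the Alpaca template (hypothetical helper)."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


def build_sampler_settings(dyn_temp_min: float = 0.0, dyn_temp_max: float = 3.0) -> dict:
    """Translate the README's settings into a request-body fragment.

    ASSUMPTION: the backend expresses dynamic temperature as a base
    temperature plus a symmetric range, so min/max become midpoint +/- range.
    The key names are assumed, not confirmed by the README.
    """
    midpoint = (dyn_temp_min + dyn_temp_max) / 2
    half_range = (dyn_temp_max - dyn_temp_min) / 2
    return {
        "min_p": 0.1,                  # Min P - 0.1
        "temperature": midpoint,       # centre of the dynamic temperature window
        "dynatemp_range": half_range,  # window extends this far above and below
        "rep_pen": 1.03,               # Rep Pen 1.03
        "rep_pen_range": 1000,         # Rep Pen Range 1000
    }


# Build a full request body; sending it is left to whatever client you use.
payload = {
    "prompt": build_alpaca_prompt(
        "Explain what a Mixture of Experts is in less than 100 words."
    ),
    "max_length": 256,  # assumed name for the generation length limit
    **build_sampler_settings(),
}

print(payload["prompt"])
print({k: v for k, v in payload.items() if k != "prompt"})
```

The sketch stops short of making a network call, since the endpoint and port depend on how the backend is launched.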