---
license: apache-2.0
tags:
- moe
- merge
- mergekit
- vicgalle/CarbonBeagle-11B
- Sao10K/Fimbulvetr-10.7B-v1
- bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
- Yhyu13/LMCocktail-10.7B-v1
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64545af5ec40bbbd01242ca6/hen3fNHRD7BCPvd2KkfjZ.png)

# Umbra-v2.1-MoE-4x10.7

Umbra is an offshoot of the Lumosia series, with a focus on general knowledge and RP/ERP.

Umbra v2.1 has updated models and a set of revamped positive and negative prompts. The model was built around the idea of a general assistant that can also tell stories and do RP/ERP when asked. This is a very experimental model: a MoE of SOLAR-based models, and the models selected are personal favorites.

Base context is 4k, but the model stays coherent up to 16k.

Please let me know how the model works for you. An Umbra personality Tavern card has been added to the files.

Update: a token error in Umbra-v2 was fixed in Umbra-v2.1.

Prompt format:

```
### System:

### USER:{prompt}

### Assistant:
```

Settings:

```
Temp: 1.0
min-p: 0.02-0.1
```

## Evals:

To be posted:

* Avg:
* ARC:
* HellaSwag:
* MMLU:
* T-QA:
* Winogrande:
* GSM8K:

## Examples:

```
posted soon
```

## 🧩 Configuration

```yaml
base_model: vicgalle/CarbonBeagle-11B
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: vicgalle/CarbonBeagle-11B
    positive_prompts: [Revamped]
  - source_model: Sao10K/Fimbulvetr-10.7B-v1
    positive_prompts: [Revamped]
  - source_model: bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED
    positive_prompts: [Revamped]
  - source_model: Yhyu13/LMCocktail-10.7B-v1
    positive_prompts: [Revamped]
```

Umbra-v2.1-MoE-4x10.7 is a Mixture of Experts (MoE) made with the following models:

* [vicgalle/CarbonBeagle-11B](https://huggingface.co/vicgalle/CarbonBeagle-11B)
* [Sao10K/Fimbulvetr-10.7B-v1](https://huggingface.co/Sao10K/Fimbulvetr-10.7B-v1)
* [bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED](https://huggingface.co/bn22/Nous-Hermes-2-SOLAR-10.7B-MISALIGNED)
* [Yhyu13/LMCocktail-10.7B-v1](https://huggingface.co/Yhyu13/LMCocktail-10.7B-v1)

## 💻 Usage

```python
# Install dependencies (notebook syntax)
!pip install -qU transformers bitsandbytes accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "Steelskull/Umbra-v2-MoE-4x10.7"
tokenizer = AutoTokenizer.from_pretrained(model)

# Load in 4-bit so the 4x10.7B MoE fits on a single consumer GPU
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    model_kwargs={"torch_dtype": torch.float16, "load_in_4bit": True},
)

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
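The tokenizer's built-in chat template may not match the `### System:` / `### USER:` / `### Assistant:` format documented above, so for story/RP use you may prefer to build the prompt by hand with the card's recommended sampling settings. A minimal sketch, assuming the `pipeline` from the Usage section is already loaded and a transformers version recent enough to accept `min_p` as a generation argument; `format_umbra_prompt` is a hypothetical helper, not part of any library:

```python
# Hypothetical helper: builds a prompt in this card's documented template.
def format_umbra_prompt(user_message: str, system_message: str = "") -> str:
    return (
        f"### System:\n{system_message}\n\n"
        f"### USER:{user_message}\n\n"
        f"### Assistant:"
    )

prompt = format_umbra_prompt("Tell me a short story about a wandering knight.")

# Card-recommended sampling: Temp 1.0, min-p 0.02-0.1
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=1.0, min_p=0.05)
print(outputs[0]["generated_text"])
```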
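For anyone curious how the configuration above becomes a checkpoint: it is a mergekit MoE config, where `gate_mode: hidden` derives the router weights from hidden-state representations of each expert's positive prompts, so gating quality depends heavily on those prompts. Below is a rough sketch, assuming a mergekit install that ships the `mergekit-moe` entry point and that the config is saved as `config.yaml`; the `positive_prompts` shown in this card are placeholders, so this reproduces the structure, not the exact model:

```python
# Notebook-syntax sketch: building the MoE with mergekit-moe.
# Assumes config.yaml holds the configuration from this card.
!pip install -qU git+https://github.com/arcee-ai/mergekit.git
!mergekit-moe config.yaml ./Umbra-MoE-4x10.7
```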