
Gixtral 100B (Mixtral 8x22B & 8x7B merged into 100B)


We merged several Mixtral-based instruct models into a single ~100B-parameter model.

Model Details

Model Description

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ehristoforu/Gixtral-100B"

# Load the tokenizer and the model, spreading the weights across available GPUs
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Example multi-turn chat history
messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
    {"role": "user", "content": "Do you have mayonnaise recipes?"}
]

# Apply the model's chat template and move the input ids to the GPU
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to("cuda")

# Generate a short continuation and decode it back to text
outputs = model.generate(inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
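
Loading the full BF16 weights requires roughly 200 GB of memory (97.9B params × 2 bytes), which is more than most single GPUs provide. The snippet below is a minimal sketch of loading the model with 4-bit quantization to reduce memory use; it assumes the bitsandbytes library is installed, and the specific BitsAndBytesConfig settings are illustrative assumptions rather than settings recommended for this model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ehristoforu/Gixtral-100B"

# Illustrative 4-bit settings (NF4 quantization with BF16 compute); an assumption, not an official recommendation
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Weights are quantized on the fly and spread across the available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

Generation then works exactly as in the example above.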

About the merge

Base models: mistralai/Mixtral-8x22B-Instruct-v0.1 & mistralai/Mixtral-8x7B-Instruct-v0.1

Merge models:

  • mistralai/Mixtral-8x22B-Instruct-v0.1
  • mistralai/Mixtral-8x7B-Instruct-v0.1
  • cognitivecomputations/dolphin-2.7-mixtral-8x7b
  • alpindale/WizardLM-2-8x22B

Merge datasets:

  • ehartford/dolphin
  • jondurbin/airoboros-2.2.1
  • ehartford/dolphin-coder
  • migtissera/Synthia-v1.3
  • teknium/openhermes
  • ise-uiuc/Magicoder-OSS-Instruct-75K
  • ise-uiuc/Magicoder-Evol-Instruct-110K
  • LDJnr/Pure-Dove

Model size: 97.9B params · Tensor type: BF16 (Safetensors)