
This is a 4x8B Llama Mixture of Experts (MoE) model. It was trained on the OpenHermes Resort subset of the Dolphin-2.9 dataset.

The model combines four Llama 8B fine-tunes using the DeepSpeed-MoE architecture. All experts are active for every token.
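The actual modeling code ships with the repo as custom code; the snippet below is only a minimal sketch of what "all experts active for every token" (dense gating) means in practice. The class name `DenseMoELayer` is hypothetical, and a plain SiLU MLP stands in for Llama's gated feed-forward block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseMoELayer(nn.Module):
    """Illustrative dense-MoE feed-forward layer: every expert processes
    every token and outputs are combined by a softmax-weighted gate."""

    def __init__(self, hidden_size: int, intermediate_size: int, num_experts: int = 4):
        super().__init__()
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(hidden_size, intermediate_size),
                nn.SiLU(),
                nn.Linear(intermediate_size, hidden_size),
            )
            for _ in range(num_experts)
        ])

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Gate weights per token: (batch, seq, num_experts). No top-k pruning,
        # so every expert contributes to every token.
        weights = F.softmax(self.gate(hidden_states), dim=-1)
        # Run all experts and stack: (batch, seq, num_experts, hidden).
        expert_out = torch.stack([e(hidden_states) for e in self.experts], dim=2)
        # Softmax-weighted sum over the expert dimension.
        return (weights.unsqueeze(-1) * expert_out).sum(dim=2)
```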

This is a very capable model, landing somewhere between Llama 8B and Llama 70B in quality. Enjoy!

Thank you to:

- CrusoeEnergy for sponsoring the compute for this project
- My collaborators Eric Hartford and Fernando (has too many names) Neto
Model size: 30.6B parameters, stored as BF16 Safetensors.
Note: the serverless Inference API does not yet support model repos that contain custom code, so the model needs to be loaded locally.
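A rough local-loading sketch with the standard transformers API is shown below. The repo id is a placeholder, and `trust_remote_code=True` is assumed to be required because of the custom modeling code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/llama-4x8b-moe"  # placeholder: substitute this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # weights are stored in BF16
    device_map="auto",
    trust_remote_code=True,      # assumed necessary: the repo ships custom code
)

inputs = tokenizer("Explain mixture-of-experts models in one sentence.",
                   return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```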