
This is a 4x8B Llama Mixture of Experts (MoE) model, trained on the OpenHermes Resort portion of the Dolphin-2.9 dataset.

The model combines four Llama fine-tunes as experts, using the DeepSpeed-MoE architecture. All experts are active for every token (dense routing, rather than top-k selection).
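Unlike sparse MoE routers that select only a top-k subset of experts per token, a dense MoE runs every expert on every token and combines their outputs, weighted by the router's softmax scores. A minimal sketch of that combination step (NumPy, with made-up dimensions; this illustrates the idea, not the actual DeepSpeed-MoE implementation):

```python
import numpy as np

def dense_moe_forward(x, expert_weights, router_weights):
    """Combine the outputs of ALL experts for each token (dense routing).

    x:              (tokens, d_model) token activations
    expert_weights: list of (d_model, d_model) matrices, one per expert
                    (real experts are MLPs; single matrices keep the sketch short)
    router_weights: (d_model, n_experts) router projection
    """
    logits = x @ router_weights                          # (tokens, n_experts)
    gates = np.exp(logits - logits.max(axis=-1, keepdims=True))
    gates /= gates.sum(axis=-1, keepdims=True)           # softmax over experts
    # Every expert processes every token -- no top-k selection step.
    expert_outs = np.stack([x @ w for w in expert_weights], axis=1)  # (tokens, n_experts, d_model)
    return (gates[..., None] * expert_outs).sum(axis=1)              # (tokens, d_model)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8))                # 3 tokens, d_model = 8
experts = [rng.standard_normal((8, 8)) for _ in range(4)]  # 4 experts, as here
router = rng.standard_normal((8, 4))
y = dense_moe_forward(x, experts, router)
print(y.shape)  # (3, 8)
```

Because the gates sum to 1 over the expert axis, each token's output is a convex combination of all four experts' outputs, which is why all ~30B parameters participate in every forward pass.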

This is a VERY good model, somewhere between Llama 8B and Llama 70B in capability. Enjoy!

Thank you to:

- CrusoeEnergy, for sponsoring the compute for this project
- My collaborators Eric Hartford and Fernando (has too many names) Neto
Model size: 30.6B parameters (Safetensors, BF16)