---
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- mergekit
- merge
---

# llama-3-moe-3x8b

This is a Mixture of Experts (MoE) merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Models Merged

The following models were included in the merge:

* [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
* [codellama/CodeLlama-7b-Instruct-hf](https://huggingface.co/codellama/CodeLlama-7b-Instruct-hf)
* [meta-math/MetaMath-Mistral-7B](https://huggingface.co/meta-math/MetaMath-Mistral-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: meta-llama/Meta-Llama-3-8B-Instruct
gate_mode: hidden   # one of "hidden", "cheap_embed", or "random"
dtype: bfloat16     # output dtype (float32, float16, or bfloat16)
## (optional)
# experts_per_token: 2
experts:
  - source_model: meta-llama/Meta-Llama-3-8B-Instruct
    positive_prompts:
      - "chat"
      - "assistant"
      - "tell me"
      - "explain"
      - "I want"
    ## (optional)
    # negative_prompts:
    #   - "This is a prompt expert_model_1 should not be used for"
  - source_model: codellama/CodeLlama-7b-Instruct-hf
    positive_prompts:
      - "code"
      - "python"
      - "javascript"
      - "programming"
      - "algorithm"
      - "C#"
      - "C++"
      - "debug"
      - "runtime"
      - "html"
      - "command"
      - "nodejs"
  - source_model: meta-math/MetaMath-Mistral-7B
    positive_prompts:
      - "reason"
      - "math"
      - "mathematics"
      - "solve"
      - "count"
      - "calculate"
      - "arithmetic"
      - "algebra"
```
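
## Usage

To reproduce the merge, the configuration above can be saved as `config.yaml` and passed to mergekit's `mergekit-moe` entry point (e.g. `mergekit-moe config.yaml ./llama-3-moe-3x8b`; the output path here is a placeholder).

Once merged, the model loads like any other `transformers` causal LM. Below is a minimal sketch, assuming the merged weights are hosted under the hypothetical repo id `your-username/llama-3-moe-3x8b`:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id -- replace with wherever the merged weights are hosted.
model_id = "your-username/llama-3-moe-3x8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the `dtype` in the merge config
    device_map="auto",
)

# The base model is instruction-tuned, so build the prompt with its chat template.
messages = [{"role": "user", "content": "Explain the difference between a list and a tuple in Python."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

With `gate_mode: hidden`, the positive prompts listed per expert are used at merge time to initialize the router, steering chat-style requests to the instruct expert, code requests to CodeLlama, and math requests to MetaMath; they are not needed at inference time.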