metadata
tags:
  - moe
  - llama
  - '3'
  - llama 3
  - 2x8b

Llama-3-Teal-Instruct-2x8B-MoE

This is an experimental MoE created from meta-llama/Meta-Llama-3-8B-Instruct and nvidia/Llama3-ChatQA-1.5-8B using Mergekit.

Green + Blue = Teal.

Mergekit YAML file:

base_model: Meta-Llama-3-8B-Instruct
experts:
  - source_model: Meta-Llama-3-8B-Instruct
    positive_prompts:
    - "explain"
    - "chat"
    - "assistant"
  - source_model: Llama3-ChatQA-1.5-8B
    positive_prompts:
    - "python"
    - "math"
    - "solve"
    - "code"
gate_mode: hidden
dtype: float16
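
For intuition about what the router in a two-expert MoE does, here is a toy sketch in plain Python. It is not Mergekit's code and does not use this model's weights. The gate vectors, the 3-dimensional hidden state, and the `route` helper are all hypothetical, and top-2 routing is assumed (Mixtral-style). With `gate_mode: hidden`, the real gate vectors are, as far as the Mergekit documentation describes, derived from hidden-state representations of the positive prompts rather than written by hand as they are here.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden, gate_vectors, top_k=2):
    """Return (expert_index, weight) pairs for one token's hidden state.

    Each expert's logit is the dot product of the hidden state with that
    expert's gate vector. The top_k experts are kept and their logits are
    normalized with a softmax so the weights sum to 1.
    """
    logits = [sum(h * g for h, g in zip(hidden, gate)) for gate in gate_vectors]
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    weights = softmax([logits[i] for i in ranked])
    return list(zip(ranked, weights))


# Hypothetical gate vectors, one per expert (illustrative values only).
gate_vectors = [
    [1.0, 0.0, 0.0],  # expert 0: stands in for Meta-Llama-3-8B-Instruct
    [0.0, 1.0, 0.0],  # expert 1: stands in for Llama3-ChatQA-1.5-8B
]

# Hypothetical hidden state for a single token.
token_hidden = [0.2, 0.9, 0.1]

routing = route(token_hidden, gate_vectors)
for expert, weight in routing:
    print(f"expert {expert}: weight {weight:.3f}")
```

Under the top-2 assumption, a model with only two experts runs both of them for every token, so the router never excludes an expert; it only decides how much each expert's output contributes.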