did this have degenerate routing?

#4
by Kquant03 - opened

I purposely gave Azathoth degenerate parameters by making every prompt exactly the same. I wanted to see how badly it would affect it. It didn't hurt it too badly, but I'm wondering if your prompts being somewhat similar gave it any degenerate routing.

My merge config😿

```yaml
base_model: Meta-Llama-3-8B-Instruct
experts:
  - source_model: Meta-Llama-3-8B-Instruct
    positive_prompts:
    - "explain"
    - "chat"
    - "assistant"
  - source_model: Llama3-8B-OpenHermes-DPO
    positive_prompts:
    - "python"
    - "math"
    - "solve"
    - "code"
  - source_model: Llama-3-SLERP-8B
    positive_prompts:
    - "chat"
    - "assistant"
    - "AI"
  - source_model: hf-llama3-8b-orpo-v0.0
    positive_prompts:
    - "think"
    - "chat"
    - "code"
    - "roleplay"
gate_mode: hidden
dtype: float16
```
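For intuition on why overlapping positive prompts can cause degenerate routing: with `gate_mode: hidden`, each expert's gate is built from hidden states of its positive prompts, so if two experts' prompts produce nearly identical hidden states, their gate vectors end up nearly parallel and the router can't tell the experts apart. Here's a rough NumPy sketch of that effect (an illustration only, not mergekit's actual implementation; the averaging and the random "hidden states" are simplifications):

```python
import numpy as np

rng = np.random.default_rng(0)


def gate_vector(prompt_hiddens):
    # Simplified stand-in for gate_mode "hidden": average the hidden-state
    # vectors of an expert's positive prompts and normalize the result.
    v = np.mean(prompt_hiddens, axis=0)
    return v / np.linalg.norm(v)


hidden_dim = 64

# Distinct prompt sets -> distinct hidden states -> distinguishable gates.
distinct = [gate_vector(rng.normal(size=(3, hidden_dim))) for _ in range(2)]

# Near-identical prompt sets -> hidden states that are small perturbations
# of the same vector -> nearly parallel (degenerate) gates.
base = rng.normal(size=hidden_dim)
similar = [gate_vector(base + 0.01 * rng.normal(size=(3, hidden_dim)))
           for _ in range(2)]

cos = lambda a, b: float(a @ b)  # unit vectors, so dot product = cosine
print("distinct prompts cos:", cos(*distinct))  # well below 1
print("similar prompts cos:", cos(*similar))    # very close to 1
```

Sharing "chat", "assistant", and "code" across several experts in the config above pushes their gate vectors in that degenerate direction, which is presumably why mergekit checks for it.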

Yeah, but like... when you ran the merge, the terminal output will have told you which layers (if any) are degenerate.

Well... it seems like I didn't save it, sorry... 🤕

raincandy-u changed discussion status to closed

it's okay lol I was just wondering
