
Description

After I put down the joint and RTFM, I have a better idea of exactly what's going on. I considered doing something similar with WANDA or SparseGPT a while back, but stopped when I ran into issues, so I'm fascinated by how this new method pulls it off.

Hypothesis

By lowering the density, I should land closer to the sweet spot shown in the paper. I'm also merging onto my fixed base model, which will hopefully help too. The weights are adjusted to make the later layers more aligned with ORCA 2.
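For context, the knob I'm turning: as I read the DARE paper, each fine-tuned delta parameter is dropped with probability $p$ and the survivors are rescaled to compensate, with density $= 1 - p$:

$$
\hat{\Delta} = \frac{m \odot \Delta}{1 - p}, \qquad m_i \sim \mathrm{Bernoulli}(1 - p)
$$

where $\Delta$ is the difference between the fine-tuned weights and the base weights. So a density of 0.35 keeps roughly 35% of a model's deltas and scales them up by about 2.9x.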

Results

I'm quite happy with this model for what it is: a personable and effective assistant. It does infodump a bit, but what genius doesn't? It writes okay erotica and general fiction, though the tone can feel a little "artificial".

Recipe

merge_method: dare_ties
base_model: athirdpath/BigLlama-20b

- model: athirdpath/CleverGirl-20b
  weight: 0.60 / density: 0.35
- model: athirdpath/CleverGirl-20b-Inverted
  weight: 0.40 / density: 0.30

int8_mask: true
dtype: bfloat16
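Spelled out as a ready-to-run mergekit config, for anyone who wants to reproduce this. The `models:` / `parameters:` nesting here follows mergekit's standard dare_ties syntax rather than anything written above; only the values come from the recipe.

```yaml
# Recipe above restated as a mergekit config.
# Nesting assumed from standard mergekit dare_ties syntax; the method,
# models, weights, densities, mask, and dtype are from the card.
merge_method: dare_ties
base_model: athirdpath/BigLlama-20b
models:
  - model: athirdpath/CleverGirl-20b
    parameters:
      weight: 0.60
      density: 0.35
  - model: athirdpath/CleverGirl-20b-Inverted
    parameters:
      weight: 0.40
      density: 0.30
parameters:
  int8_mask: true
dtype: bfloat16
```

Assuming mergekit is installed, `mergekit-yaml config.yml ./merged` should rebuild the merge.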
