Which Mergekit did you use for this?

#1
by softwareweaver - opened

Which Mergekit did you use for this? The standard one did not work. Thanks.

That's completely correct.

Mergekit has two issues when merging CohereForAI/c4ai-command-r-plus:

  • The layers added in c4ai-command-r-plus (the q_norm and k_norm weights) are not supported.
  • The lm_head section in cohere.json produces a model that llama.cpp does not support.

So I wrote a patch for these issues:

--- a/mergekit/_data/architectures/cohere.json
+++ b/mergekit/_data/architectures/cohere.json
@@ -12,13 +12,6 @@
     "post_weights": [
         {
             "name": "model.norm.weight"
-        },
-        {
-            "name": "lm_head.weight",
-            "is_embed": true,
-            "aliases": [
-                "model.embed_tokens.weight"
-            ]
         }
     ],
     "num_layers_config_key": "num_hidden_layers",
@@ -36,9 +29,15 @@
             {
                 "name": "model.layers.${layer_index}.mlp.up_proj.weight"
             },
+            {
+                "name": "model.layers.${layer_index}.self_attn.q_norm.weight"
+            },
             {
                 "name": "model.layers.${layer_index}.self_attn.q_proj.weight"
             },
+            {
+                "name": "model.layers.${layer_index}.self_attn.k_norm.weight"
+            },
             {
                 "name": "model.layers.${layer_index}.self_attn.k_proj.weight"
             },

This is a hack, but it works fine for c4ai-command-r-plus self-merging.
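
For anyone who would rather not apply the diff by hand, the same change can be sketched as a small Python helper. This is only a rough sketch, not part of Mergekit: the function name `patch_cohere_arch` is made up for illustration, and it assumes the per-layer weight list in cohere.json sits under `layer_templates` → `weights`, which the diff above does not show.

```python
import copy

# Weight names added by the patch. "${layer_index}" is the literal
# placeholder used in the architecture file, not Python formatting.
Q_NORM = "model.layers.${layer_index}.self_attn.q_norm.weight"
K_NORM = "model.layers.${layer_index}.self_attn.k_norm.weight"


def patch_cohere_arch(arch: dict) -> dict:
    """Return a patched copy of a cohere.json-style architecture dict.

    Hypothetical helper mirroring the diff: drop lm_head.weight from
    post_weights and add q_norm/k_norm to the per-layer weights.
    """
    patched = copy.deepcopy(arch)

    # 1. Remove the lm_head.weight entry from post_weights.
    patched["post_weights"] = [
        w for w in patched.get("post_weights", [])
        if w.get("name") != "lm_head.weight"
    ]

    # 2. Add the norm weights just before their matching projections.
    # Assumption: layer weights are stored under layer_templates.weights.
    weights = patched["layer_templates"]["weights"]
    existing = {w["name"] for w in weights}
    pairs = (
        (Q_NORM, "self_attn.q_proj.weight"),
        (K_NORM, "self_attn.k_proj.weight"),
    )
    for norm_name, proj_suffix in pairs:
        if norm_name in existing:
            continue  # already patched, keep the operation idempotent
        # Fall back to appending if the projection entry is not found.
        index = next(
            (i for i, w in enumerate(weights)
             if w["name"].endswith(proj_suffix)),
            len(weights),
        )
        weights.insert(index, {"name": norm_name})

    return patched
```

To use it, load mergekit/_data/architectures/cohere.json with `json.load`, pass the result through `patch_cohere_arch`, and write it back with `json.dump`. Keep a backup of the original file, since this is the same hack as the diff and not an upstream fix.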

Thanks. I will try that. I was thinking of merging command-r with softwareweaver/Twilight-Miqu-146B.

nitky changed discussion status to closed
