---
base_model:
- google/gemma-2-2b-it
tags:
- merge
- mergekit
- lazymergekit
- google/gemma-2-2b-it
---

# gemma-instruct-merge-named_correctly

gemma-instruct-merge-named_correctly is a TIES merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing), with [google/gemma-2-2b](https://huggingface.co/google/gemma-2-2b) as the base model:
* [google/gemma-2-2b-it](https://huggingface.co/google/gemma-2-2b-it)

## 🧩 Configuration

```yaml
models:
  - model: google/gemma-2-2b
  - model: google/gemma-2-2b-it
    parameters:
      density:
        - filter: model.layers.0.self_attn.q_proj
          value: 0.00539
        - filter: model.layers.1.self_attn.q_proj
          value: 0.3256
        - filter: model.layers.2.self_attn.q_proj
          value: 0.03843
        - filter: model.layers.3.self_attn.q_proj
          value: 0.14886
        - filter: model.layers.4.self_attn.q_proj
          value: 0.41323
        - filter: model.layers.5.self_attn.q_proj
          value: 0.33564
        - filter: model.layers.6.self_attn.q_proj
          value: 0.03716
        - filter: model.layers.7.self_attn.q_proj
          value: 0.1084
        - filter: model.layers.8.self_attn.q_proj
          value: 0.16658
        - filter: model.layers.9.self_attn.q_proj
          value: 0.19716
        - filter: model.layers.10.self_attn.q_proj
          value: 0.10008
        - filter: model.layers.11.self_attn.q_proj
          value: 0.37198
        - filter: model.layers.12.self_attn.q_proj
          value: 0.3622
        - filter: model.layers.13.self_attn.q_proj
          value: 0.26806
        - filter: model.layers.14.self_attn.q_proj
          value: 0.73192
        - filter: model.layers.15.self_attn.q_proj
          value: 0.14731
        - filter: model.layers.16.self_attn.q_proj
          value: 0.46923
        - filter: model.layers.17.self_attn.q_proj
          value: 0.63063
        - filter: model.layers.18.self_attn.q_proj
          value: 0.65673
        - filter: model.layers.19.self_attn.q_proj
          value: 0.66558
        - filter: model.layers.20.self_attn.q_proj
          value: 0.63758
        - filter: model.layers.21.self_attn.q_proj
          value: 0.64362
        - filter: model.layers.22.self_attn.q_proj
          value: 0.40341
        - filter: model.layers.23.self_attn.q_proj
          value: 0.12982
        - filter: model.layers.24.self_attn.q_proj
          value: 0.04552
        - filter: model.layers.25.self_attn.q_proj
          value: 0.03919
        - filter: model.layers.0.self_attn.k_proj
          value: 0.00592
        - filter: model.layers.1.self_attn.k_proj
          value: 0.18684
        - filter: model.layers.2.self_attn.k_proj
          value: 0.02603
        - filter: model.layers.3.self_attn.k_proj
          value: 0.07283
        - filter: model.layers.4.self_attn.k_proj
          value: 0.27067
        - filter: model.layers.5.self_attn.k_proj
          value: 0.18189
        - filter: model.layers.6.self_attn.k_proj
          value: 0.10695
        - filter: model.layers.7.self_attn.k_proj
          value: 0.17169
        - filter: model.layers.8.self_attn.k_proj
          value: 0.08753
        - filter: model.layers.9.self_attn.k_proj
          value: 0.12807
        - filter: model.layers.10.self_attn.k_proj
          value: 0.07783
        - filter: model.layers.11.self_attn.k_proj
          value: 0.41518
        - filter: model.layers.12.self_attn.k_proj
          value: 0.42307
        - filter: model.layers.13.self_attn.k_proj
          value: 0.2896
        - filter: model.layers.14.self_attn.k_proj
          value: 0.52953
        - filter: model.layers.15.self_attn.k_proj
          value: 0.25035
        - filter: model.layers.16.self_attn.k_proj
          value: 0.57679
        - filter: model.layers.17.self_attn.k_proj
          value: 0.52768
        - filter: model.layers.18.self_attn.k_proj
          value: 0.666
        - filter: model.layers.19.self_attn.k_proj
          value: 0.67796
        - filter: model.layers.20.self_attn.k_proj
          value: 0.58825
        - filter: model.layers.21.self_attn.k_proj
          value: 0.51447
        - filter: model.layers.22.self_attn.k_proj
          value: 0.33317
        - filter: model.layers.23.self_attn.k_proj
          value: 0.05987
        - filter: model.layers.24.self_attn.k_proj
          value: 0.02903
        - filter: model.layers.25.self_attn.k_proj
          value: 0.08715
        - filter: model.layers.0.self_attn.v_proj
          value: 0.03025
        - filter: model.layers.1.self_attn.v_proj
          value: 0.14137
        - filter: model.layers.2.self_attn.v_proj
          value: 0.00286
        - filter: model.layers.3.self_attn.v_proj
          value: 0.09155
        - filter: model.layers.4.self_attn.v_proj
          value: 0.25172
        - filter: model.layers.5.self_attn.v_proj
          value: 0.2062
        - filter: model.layers.6.self_attn.v_proj
          value: 0.06811
        - filter: model.layers.7.self_attn.v_proj
          value: 0.01334
        - filter: model.layers.8.self_attn.v_proj
          value: 0.18251
        - filter: model.layers.9.self_attn.v_proj
          value: 0.10415
        - filter: model.layers.10.self_attn.v_proj
          value: 0.04
        - filter: model.layers.11.self_attn.v_proj
          value: 0.22664
        - filter: model.layers.12.self_attn.v_proj
          value: 0.16337
        - filter: model.layers.13.self_attn.v_proj
          value: 0.09347
        - filter: model.layers.14.self_attn.v_proj
          value: 0.52549
        - filter: model.layers.15.self_attn.v_proj
          value: 0.02152
        - filter: model.layers.16.self_attn.v_proj
          value: 0.25325
        - filter: model.layers.17.self_attn.v_proj
          value: 0.43041
        - filter: model.layers.18.self_attn.v_proj
          value: 0.49113
        - filter: model.layers.19.self_attn.v_proj
          value: 0.32307
        - filter: model.layers.20.self_attn.v_proj
          value: 0.41922
        - filter: model.layers.21.self_attn.v_proj
          value: 0.32166
        - filter: model.layers.22.self_attn.v_proj
          value: 0.34057
        - filter: model.layers.23.self_attn.v_proj
          value: 0.24319
        - filter: model.layers.24.self_attn.v_proj
          value: 0.07956
        - filter: model.layers.25.self_attn.v_proj
          value: 0.13849
        - filter: model.layers.0.self_attn.o_proj
          value: 0.03265
        - filter: model.layers.1.self_attn.o_proj
          value: 0.22645
        - filter: model.layers.2.self_attn.o_proj
          value: 0.06134
        - filter: model.layers.3.self_attn.o_proj
          value: 0.16228
        - filter: model.layers.4.self_attn.o_proj
          value: 0.07924
        - filter: model.layers.5.self_attn.o_proj
          value: 0.1259
        - filter: model.layers.6.self_attn.o_proj
          value: 0.09982
        - filter: model.layers.7.self_attn.o_proj
          value: 0.02826
        - filter: model.layers.8.self_attn.o_proj
          value: 0.03906
        - filter: model.layers.9.self_attn.o_proj
          value: 1.0
        - filter: model.layers.10.self_attn.o_proj
          value: 0.51227
        - filter: model.layers.11.self_attn.o_proj
          value: 0.34671
        - filter: model.layers.12.self_attn.o_proj
          value: 0.70159
        - filter: model.layers.13.self_attn.o_proj
          value: 0.4424
        - filter: model.layers.14.self_attn.o_proj
          value: 0.74354
        - filter: model.layers.15.self_attn.o_proj
          value: 0.16266
        - filter: model.layers.16.self_attn.o_proj
          value: 0.71896
        - filter: model.layers.17.self_attn.o_proj
          value: 0.53179
        - filter: model.layers.18.self_attn.o_proj
          value: 0.18472
        - filter: model.layers.19.self_attn.o_proj
          value: 0.09507
        - filter: model.layers.20.self_attn.o_proj
          value: 0.13472
        - filter: model.layers.21.self_attn.o_proj
          value: 0.23246
        - filter: model.layers.22.self_attn.o_proj
          value: 0.10599
        - filter: model.layers.23.self_attn.o_proj
          value: 0.00282
        - filter: model.layers.24.self_attn.o_proj
          value: 0.09864
        - filter: model.layers.25.self_attn.o_proj
          value: 0.00961
        - filter: model.layers.0.mlp.gate_proj
          value: 0.4928
        - filter: model.layers.1.mlp.gate_proj
          value: 0.08775
        - filter: model.layers.2.mlp.gate_proj
          value: 0.0002
        - filter: model.layers.3.mlp.gate_proj
          value: 0.12885
        - filter: model.layers.4.mlp.gate_proj
          value: 0.50973
        - filter: model.layers.5.mlp.gate_proj
          value: 0.29085
        - filter: model.layers.6.mlp.gate_proj
          value: 0.06577
        - filter: model.layers.7.mlp.gate_proj
          value: 0.11425
        - filter: model.layers.8.mlp.gate_proj
          value: 0.12968
        - filter: model.layers.9.mlp.gate_proj
          value: 0.19222
        - filter: model.layers.10.mlp.gate_proj
          value: 0.17268
        - filter: model.layers.11.mlp.gate_proj
          value: 0.10512
        - filter: model.layers.12.mlp.gate_proj
          value: 0.02651
        - filter: model.layers.13.mlp.gate_proj
          value: 0.04687
        - filter: model.layers.14.mlp.gate_proj
          value: 0.14716
        - filter: model.layers.15.mlp.gate_proj
          value: 0.03147
        - filter: model.layers.16.mlp.gate_proj
          value: 0.05726
        - filter: model.layers.17.mlp.gate_proj
          value: 0.04511
        - filter: model.layers.18.mlp.gate_proj
          value: 0.25063
        - filter: model.layers.19.mlp.gate_proj
          value: 0.26688
        - filter: model.layers.20.mlp.gate_proj
          value: 0.37716
        - filter: model.layers.21.mlp.gate_proj
          value: 0.3862
        - filter: model.layers.22.mlp.gate_proj
          value: 0.27378
        - filter: model.layers.23.mlp.gate_proj
          value: 0.08641
        - filter: model.layers.24.mlp.gate_proj
          value: 0.37212
        - filter: model.layers.25.mlp.gate_proj
          value: 0.31151
        - filter: model.layers.0.mlp.up_proj
          value: 0.24228
        - filter: model.layers.1.mlp.up_proj
          value: 0.06887
        - filter: model.layers.2.mlp.up_proj
          value: 0.20625
        - filter: model.layers.3.mlp.up_proj
          value: 0.27546
        - filter: model.layers.4.mlp.up_proj
          value: 0.59112
        - filter: model.layers.5.mlp.up_proj
          value: 0.41203
        - filter: model.layers.6.mlp.up_proj
          value: 0.07411
        - filter: model.layers.7.mlp.up_proj
          value: 0.05424
        - filter: model.layers.8.mlp.up_proj
          value: 0.23792
        - filter: model.layers.9.mlp.up_proj
          value: 0.1677
        - filter: model.layers.10.mlp.up_proj
          value: 0.17054
        - filter: model.layers.11.mlp.up_proj
          value: 0.18762
        - filter: model.layers.12.mlp.up_proj
          value: 0.08044
        - filter: model.layers.13.mlp.up_proj
          value: 0.0021
        - filter: model.layers.14.mlp.up_proj
          value: 0.26389
        - filter: model.layers.15.mlp.up_proj
          value: 0.06886
        - filter: model.layers.16.mlp.up_proj
          value: 0.21476
        - filter: model.layers.17.mlp.up_proj
          value: 0.21777
        - filter: model.layers.18.mlp.up_proj
          value: 0.32111
        - filter: model.layers.19.mlp.up_proj
          value: 0.2885
        - filter: model.layers.20.mlp.up_proj
          value: 0.40549
        - filter: model.layers.21.mlp.up_proj
          value: 0.42539
        - filter: model.layers.22.mlp.up_proj
          value: 0.31214
        - filter: model.layers.23.mlp.up_proj
          value: 0.02931
        - filter: model.layers.24.mlp.up_proj
          value: 0.3693
        - filter: model.layers.25.mlp.up_proj
          value: 0.345
        - filter: model.layers.0.mlp.down_proj
          value: 0.06756
        - filter: model.layers.1.mlp.down_proj
          value: 0.03746
        - filter: model.layers.2.mlp.down_proj
          value: 0.09104
        - filter: model.layers.3.mlp.down_proj
          value: 0.06643
        - filter: model.layers.4.mlp.down_proj
          value: 0.05003
        - filter: model.layers.5.mlp.down_proj
          value: 0.0406
        - filter: model.layers.6.mlp.down_proj
          value: 0.01609
        - filter: model.layers.7.mlp.down_proj
          value: 0.09629
        - filter: model.layers.8.mlp.down_proj
          value: 0.08912
        - filter: model.layers.9.mlp.down_proj
          value: 0.12874
        - filter: model.layers.10.mlp.down_proj
          value: 0.04635
        - filter: model.layers.11.mlp.down_proj
          value: 0.0099
        - filter: model.layers.12.mlp.down_proj
          value: 0.03487
        - filter: model.layers.13.mlp.down_proj
          value: 0.04977
        - filter: model.layers.14.mlp.down_proj
          value: 0.00393
        - filter: model.layers.15.mlp.down_proj
          value: 0.00748
        - filter: model.layers.16.mlp.down_proj
          value: 0.06696
        - filter: model.layers.17.mlp.down_proj
          value: 0.02067
        - filter: model.layers.18.mlp.down_proj
          value: 0.10311
        - filter: model.layers.19.mlp.down_proj
          value: 0.009
        - filter: model.layers.20.mlp.down_proj
          value: 0.0215
        - filter: model.layers.21.mlp.down_proj
          value: 0.04196
        - filter: model.layers.22.mlp.down_proj
          value: 0.06326
        - filter: model.layers.23.mlp.down_proj
          value: 0.21737
        - filter: model.layers.24.mlp.down_proj
          value: 0.19338
        - filter: model.layers.25.mlp.down_proj
          value: 0.04905
      weight:
        - value: 1
merge_method: ties
base_model: google/gemma-2-2b
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: union
```

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "choprahetarth/gemma-instruct-merge-named_correctly"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build a Gemma-style chat prompt from the message list
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample a completion from the merged model
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
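To reproduce or tweak this merge, a minimal sketch of running the configuration above with [mergekit](https://github.com/arcee-ai/mergekit) follows. The `config.yaml` filename and output directory are placeholders, and this assumes the YAML block from the Configuration section has been saved to disk:

```python
!pip install -qU mergekit

# A minimal sketch: re-run the TIES merge from the configuration above.
# Assumes the config YAML was saved as config.yaml; "./merged-model" is
# a placeholder output directory.
!mergekit-yaml config.yaml ./merged-model
```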