---
base_model:
- mlabonne/NeuralHermes-2.5-Mistral-7B
- mistralai/Mistral-7B-v0.1
- OpenPipe/mistral-ft-optimized-1218
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
- fr
- ar
library_name: transformers
pipeline_tag: text-generation
---

### Merge Method

This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) as the base model.

### TIES

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/6338c06c107c4835a05699f9/FnHh98ld6KSj2RomTiv6K.jpeg)

In Yadav et al.'s paper, **TIES-Merging** is introduced as an efficient approach for consolidating multiple task-specific models into a single multitask model. The method addresses two primary challenges in model merging. First, it tackles redundancy in model parameters: it looks at the changes each model made during fine-tuning, keeps only the top-k% most significant changes, and discards the rest. Second, it resolves conflicts between parameter signs, which arise when different models suggest opposing adjustments to the same parameter; to do so, TIES-Merging builds a unified sign vector representing the dominant direction of change across all models.

The TIES-Merging process is structured into three key steps:

1. **Trim**: reduce redundancy by retaining only a fraction of the most significant parameter changes.
2. **Elect Sign**: resolve sign conflicts by establishing a unified sign vector based on the dominant direction of change.
3. **Disjoint Merge**: average the parameter values that agree with the unified sign vector, excluding zeroed values.
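To make the three steps concrete, here is a minimal sketch of TIES-Merging applied to a single weight tensor. The `ties_merge` helper, the equal per-model weighting, and the toy tensors are illustrative assumptions; the actual model was produced by mergekit, not by this code.

```python
import torch

def ties_merge(base: torch.Tensor, finetuned: list[torch.Tensor],
               density: float = 0.5) -> torch.Tensor:
    # Task vectors: the change each fine-tuned model made to the base weights.
    deltas = [ft - base for ft in finetuned]

    # 1. Trim: keep only the top `density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.numel()))
        threshold = d.abs().flatten().kthvalue(d.numel() - k + 1).values
        trimmed.append(torch.where(d.abs() >= threshold, d, torch.zeros_like(d)))

    # 2. Elect Sign: the dominant direction per parameter is the sign of the
    #    summed trimmed deltas across all models.
    stacked = torch.stack(trimmed)
    elected_sign = torch.sign(stacked.sum(dim=0))

    # 3. Disjoint Merge: average only the deltas that agree with the elected
    #    sign; parameters with no agreeing delta keep the base value.
    agrees = (torch.sign(stacked) == elected_sign) & (stacked != 0)
    counts = agrees.sum(dim=0).clamp(min=1)
    merged_delta = (stacked * agrees).sum(dim=0) / counts

    return base + merged_delta

# Toy example: merge two "fine-tuned" variants of a 4-element weight tensor.
base = torch.zeros(4)
merged = ties_merge(
    base,
    [torch.tensor([0.9, -0.1, 0.0, 0.4]),
     torch.tensor([1.1, 0.2, 0.0, -0.6])],
    density=0.5,
)
```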
### Models Merged

The following models were included in the merge:

* [mlabonne/NeuralHermes-2.5-Mistral-7B](https://huggingface.co/mlabonne/NeuralHermes-2.5-Mistral-7B)
* [OpenPipe/mistral-ft-optimized-1218](https://huggingface.co/OpenPipe/mistral-ft-optimized-1218)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: OpenPipe/mistral-ft-optimized-1218
    parameters:
      density: 0.5
      weight: 0.5
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      density: 0.5
      weight: 0.5
merge_method: ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  normalize: true
dtype: float16
```

## Usage

```python
# Load the model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ayoubkirouane/Mistral-TIES-Merged7B")
model = AutoModelForCausalLM.from_pretrained("ayoubkirouane/Mistral-TIES-Merged7B")

# Or load it in 4-bit
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# NF4 quantization with double quantization and bfloat16 compute
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16
)

model = AutoModelForCausalLM.from_pretrained(
    "ayoubkirouane/Mistral-TIES-Merged7B",
    device_map="auto",
    quantization_config=nf4_config,
    use_cache=False
)
tokenizer = AutoTokenizer.from_pretrained("ayoubkirouane/Mistral-TIES-Merged7B")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"


def generate_response(prompt, model, max_new_tokens):
    # Tokenize the prompt and move it to the GPU
    encoded_input = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
    model_inputs = encoded_input.to("cuda")
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )
    decoded_output = tokenizer.batch_decode(generated_ids)
    # Strip the prompt from the decoded text and return only the completion
    return decoded_output[0].replace(prompt, "")
```
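With the model and helper above loaded, a call might look like this (the prompt and token budget are illustrative):

```python
prompt = "Explain the TIES merge method in two sentences."
print(generate_response(prompt, model, max_new_tokens=128))
```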