---
base_model:
- saishf/Long-Neural-SOVLish-Devil-8B-L3-262K
- saishf/Merge-Mayhem-L3-V2
- saishf/Neural-SOVLish-Devil-8B-L3
- saishf/SOVLish-Maid-L3-8B
- saishf/Merge-Mayhem-L3-V2.1
library_name: transformers
tags:
- mergekit
- merge
license: cc-by-nc-4.0
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

**Experimental**

This model is an attempt to push [saishf/SOVL-Mega-Mash-V2-L3-8B](https://huggingface.co/saishf/SOVL-Mega-Mash-V2-L3-8B) (my personal favourite model) to support 32K+ context.

### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, using [saishf/Long-Neural-SOVLish-Devil-8B-L3-262K](https://huggingface.co/saishf/Long-Neural-SOVLish-Devil-8B-L3-262K) as a base.

### Models Merged

The following models were included in the merge:

* [saishf/Merge-Mayhem-L3-V2](https://huggingface.co/saishf/Merge-Mayhem-L3-V2)
* [saishf/Neural-SOVLish-Devil-8B-L3](https://huggingface.co/saishf/Neural-SOVLish-Devil-8B-L3)
* [saishf/SOVLish-Maid-L3-8B](https://huggingface.co/saishf/SOVLish-Maid-L3-8B)
* [saishf/Merge-Mayhem-L3-V2.1](https://huggingface.co/saishf/Merge-Mayhem-L3-V2.1)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: saishf/Neural-SOVLish-Devil-8B-L3
  - model: saishf/Merge-Mayhem-L3-V2
  - model: saishf/Merge-Mayhem-L3-V2.1
  - model: saishf/SOVLish-Maid-L3-8B
merge_method: model_stock
base_model: saishf/Long-Neural-SOVLish-Devil-8B-L3-262K
dtype: bfloat16
```
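### Usage

Once merged, the result can be loaded like any other Llama-3 8B checkpoint with 🤗 Transformers. The sketch below is a minimal example, not part of the original merge process; the repo id is a placeholder for wherever this merge is published.

```python
# Minimal usage sketch with transformers.
# NOTE: the repo id below is a placeholder -- replace it with the
# actual Hugging Face repo this model card belongs to.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "saishf/SOVL-Mega-Mash-V2-L3-8B"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was produced in bfloat16
    device_map="auto",
)

prompt = "Write a short story about a maid with a devilish streak."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```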