llama-3-8b-merged-linear
Overview
This model represents a linear merge of three distinct Llama 3-8b models using the Mergekit tool. The primary goal of this merge is to leverage the unique strengths of each base model, such as multilingual capabilities and specialized domain knowledge, into a more versatile and generalized language model.
By merging these models linearly, we combine their expertise into a unified model that performs well across various tasks, such as text generation, multilingual understanding, and domain-specific tasks.
Model Details
Model Description
Models Used:
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- DeepMount00/Llama-3-8b-Ita
- lightblue/suzume-llama-3-8B-multilingual
Merging Tool: Mergekit
Merge Method: Linear merge with equal weighting (1.0) for all models
Tokenizer Source: Union
Data Type: float16 (FP16) precision
License: MIT License
Languages Supported: Multilingual, including English, Italian, and potentially others from the multilingual base models
Configuration
The following YAML configuration was used to produce this model:
models:
- model: Danielbrdz/Barcenas-Llama3-8b-ORPO
parameters:
weight: 1.0
- model: DeepMount00/Llama-3-8b-Ita
parameters:
weight: 1.0
- model: lightblue/suzume-llama-3-8B-multilingual
parameters:
weight: 1.0
merge_method: linear
tokenizer_source: union
dtype: float16
- Downloads last month
- 24
Model tree for vhab10/llama-3-8b-merged-linear
Base model
meta-llama/Meta-Llama-3-8B