llama-3-8b-merged-linear

Overview

This model is a linear merge of three Llama 3 8B models produced with the Mergekit tool. The goal of the merge is to combine the strengths of each base model, such as multilingual capability and specialized domain knowledge, into a single, more versatile language model.

Merging the models linearly combines their expertise into a unified model intended to perform well across text generation, multilingual understanding, and domain-specific tasks.
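Conceptually, a linear merge averages the corresponding parameter tensors of the source models, weighted by the per-model weights in the configuration. The sketch below only illustrates that idea; it is not Mergekit's actual implementation, and the function name and toy tensors are made up for the example.

import torch

def linear_merge(tensors, weights):
    """Weighted average of corresponding parameter tensors from several models."""
    total = sum(weights)
    merged = torch.zeros_like(tensors[0], dtype=torch.float32)
    for tensor, weight in zip(tensors, weights):
        merged += (weight / total) * tensor.to(torch.float32)
    return merged.to(tensors[0].dtype)

# With equal weights of 1.0 for all three models, this reduces to a plain
# element-wise mean of the corresponding tensors.
toy_tensors = [torch.randn(4, 4) for _ in range(3)]
print(linear_merge(toy_tensors, [1.0, 1.0, 1.0]))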

Model Details

Model Description

  • Models Used:

    • Danielbrdz/Barcenas-Llama3-8b-ORPO
    • DeepMount00/Llama-3-8b-Ita
    • lightblue/suzume-llama-3-8B-multilingual
  • Merging Tool: Mergekit

  • Merge Method: Linear merge with equal weighting (1.0) for all models

  • Tokenizer Source: Union

  • Data Type: float16 (FP16) precision (see the loading example after this list)

  • License: MIT License

  • Languages Supported: Primarily English and Italian, with potential coverage of additional languages from the multilingual Suzume base model
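
Because the merged checkpoint keeps the standard Llama 3 8B architecture, it can be loaded like any other Hugging Face causal language model. The snippet below is a minimal loading sketch in FP16; the prompt and generation settings are illustrative only.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vhab10/llama-3-8b-merged-linear"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # the merge was exported in FP16
    device_map="auto",
)

prompt = "Briefly explain what a linear model merge is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))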

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      weight: 1.0
  - model: DeepMount00/Llama-3-8b-Ita
    parameters:
      weight: 1.0
  - model: lightblue/suzume-llama-3-8B-multilingual
    parameters:
      weight: 1.0
merge_method: linear
tokenizer_source: union
dtype: float16
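
The merge can be reproduced from this configuration with Mergekit. The sketch below assumes Mergekit's documented Python API (MergeConfiguration, MergeOptions, run_merge); option names may differ slightly across versions, and the config path and output directory are placeholders. Alternatively, the mergekit-yaml command-line entry point can be pointed at the same file.

import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML configuration shown above (saved locally as config.yml).
with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the merge; the output directory is a placeholder.
run_merge(
    merge_config,
    out_path="./llama-3-8b-merged-linear",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
    ),
)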