Llama-3-13B-Instruct

This is a merge of pre-trained language models created using mergekit.

The goal was to create a 13B Llama 3 model, filling the "mid"-size gap that Meta has released in past generations but not for Llama 3. I would consider this a base model to be further finetuned. Surprisingly, it is usable for chat and storywriting with the Llama 3 Instruct template, though it occasionally shows grammatical quirks similar to L3-120B.

Logical ability (programming, math, science, etc.) is degraded by the merge process.

Use no repetition penalty, or keep it below 1.05, or the model may go a bit haywire; otherwise it is suitable for writing use. I have not tested it against L3 8B in that regard.
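Since the model expects the Llama 3 Instruct template, here is a minimal sketch of that prompt format. The `build_llama3_prompt` helper is hypothetical (in practice you would use `tokenizer.apply_chat_template()` from transformers); the special tokens are those of the Llama 3 tokenizer.

```python
def build_llama3_prompt(system: str, user: str) -> str:
    """Hand-assemble a single-turn Llama 3 Instruct prompt (hypothetical helper)."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        # Generation should continue from the open assistant header below.
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt("You are a helpful writer.", "Write a short story.")
```

When sampling, pair this with no repetition penalty (or at most 1.05) as noted above.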

Finetuned Version

A finetuned version of this model can be found at elinas/Llama-3-13B-Instruct-ft, which appears to improve performance.

Merge Details

Merge Method

This model was merged using the passthrough merge method.

Models Merged

The following model was included in the merge:

meta-llama/Meta-Llama-3-8B-Instruct

Configuration

The following YAML configuration was used to produce this model:

dtype: bfloat16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 10]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [5, 15]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [10, 20]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [15, 25]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [20, 25]
    model: meta-llama/Meta-Llama-3-8B-Instruct
- sources:
  - layer_range: [22, 32]
    model: meta-llama/Meta-Llama-3-8B-Instruct
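As a sanity check on the 13B figure, the slice configuration above can be tallied. Assuming mergekit's `layer_range` is half-open (Python slice semantics), each slice contributes `end - start` layers to the merged stack:

```python
# Layer ranges copied from the passthrough merge config above.
slices = [(0, 10), (5, 15), (10, 20), (15, 25), (20, 25), (22, 32)]

# Each [start, end) slice contributes end - start decoder layers.
merged_layers = sum(end - start for start, end in slices)
print(merged_layers)  # 55 layers, vs. 32 in the source 8B model
```

With Llama-3-8B's roughly 0.22B parameters per decoder layer plus embeddings, 55 layers lands around the stated 13B total.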

Model Evaluation

TBD - submitted
