---
base_model:
  - meta-llama/Meta-Llama-3-8B-Instruct
  - NousResearch/Hermes-2-Pro-Llama-3-8B
  - aaditya/Llama3-OpenBioLLM-8B
library_name: transformers
tags:
  - mergekit
  - merge
license: llama3
---

# Llama3-merge-biomed-8b

This is a DARE-TIES merge of meta-llama/Meta-Llama-3-8B-Instruct, NousResearch/Hermes-2-Pro-Llama-3-8B, and aaditya/Llama3-OpenBioLLM-8B.
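
The merged model can be loaded like any other Llama 3 Instruct checkpoint with 🤗 Transformers. A minimal sketch, assuming the model is published under the repository id `lighteternal/Llama3-merge-biomed-8b` (hypothetical id inferred from this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repository id; adjust to the actual Hub path of this merge.
model_id = "lighteternal/Llama3-merge-biomed-8b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# Llama-3-Instruct style chat prompt via the tokenizer's chat template.
messages = [
    {"role": "system", "content": "You are a helpful biomedical assistant."},
    {"role": "user", "content": "Briefly explain what beta-blockers do."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```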

## Leaderboard metrics

| Task | Metric | Llama3-merge-biomed-8b (%) | Llama3-8B-Inst (%) | Llama3-OpenBioLLM-8B (%) |
|---|---|---|---|---|
| ARC Challenge | Accuracy | 59.39 | 57.17 | 55.38 |
| ARC Challenge | Normalized Accuracy | 63.65 | 60.75 | 58.62 |
| HellaSwag | Accuracy | 62.59 | 59.04 | 61.83 |
| HellaSwag | Normalized Accuracy | 81.53 | 78.55 | 80.76 |
| Winogrande | Accuracy | 75.93 | 74.51 | 70.88 |
| GSM8K | Accuracy | 59.36 | 68.69 | 10.16 |
| HendrycksTest-Average | Accuracy | 67.85 | 67.07 | 64.40 |
| HendrycksTest-Average | Normalized Accuracy | 67.85 | 67.07 | 64.40 |
| HendrycksTest-Anatomy | Accuracy | 72.59 | 65.19 | 56.30 |
| HendrycksTest-Clinical Knowledge | Accuracy | 77.83 | 74.72 | 60.38 |
| HendrycksTest-College Biology | Accuracy | 79.86 | 79.86 | 79.86 |
| HendrycksTest-College Medicine | Accuracy | 70.81 | 63.58 | 62.28 |
| HendrycksTest-Medical Genetics | Accuracy | 84.00 | 80.00 | 76.00 |
| HendrycksTest-Professional Medicine | Accuracy | 71.69 | 71.69 | 69.41 |

This is a merge of pre-trained language models created using mergekit.

## Merge Details

### Merge Method

This model was merged using the DARE-TIES merge method, with meta-llama/Meta-Llama-3-8B-Instruct as the base model.

### Models Merged

The following models were included in the merge:

* NousResearch/Hermes-2-Pro-Llama-3-8B
* aaditya/Llama3-OpenBioLLM-8B

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    # Base model providing a general foundation without specific parameters

  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.5

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1

  - model: aaditya/Llama3-OpenBioLLM-8B
    parameters:
      density: 0.55
      weight: 0.4

merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B-Instruct
parameters:
  int8_mask: true
dtype: bfloat16
```
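
To reproduce the merge, the configuration above can be passed to mergekit. Below is a minimal sketch using mergekit's Python entry points (`MergeConfiguration` and `run_merge`, as shown in the mergekit repository's examples); the exact API may differ between mergekit versions, and the `mergekit-yaml` CLI is an equivalent alternative.

```python
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Load the YAML configuration shown above (saved locally as config.yaml).
with open("config.yaml", "r", encoding="utf-8") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

# Run the DARE-TIES merge and write the merged weights to ./merged.
run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(
        cuda=False,           # set True to merge on GPU
        copy_tokenizer=True,  # copy the base model's tokenizer into the output
        lazy_unpickle=True,   # reduce peak memory while loading shards
    ),
)
```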