metadata
license: llama2
library_name: transformers
tags:
  - mergekit
  - merge
base_model:
  - lmsys/vicuna-7b-v1.5
  - meta-math/MetaMath-Llemma-7B

merge

This is a merge of pre-trained language models created using mergekit. The models lmsys/vicuna-7b-v1.5 and meta-math/MetaMath-Llemma-7B were merged with the SLERP (spherical linear interpolation) method.
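To illustrate what a SLERP merge does to a pair of weight tensors, here is a minimal NumPy sketch. It is not mergekit's implementation, and the function name `slerp`, the `eps` threshold, and the fallback to linear interpolation are assumptions made for this example. The interpolation factor used for this model is not stated in the card, so `t` below is purely illustrative.

```python
import numpy as np


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two weight tensors (illustrative only).

    t = 0 returns v0, t = 1 returns v1. Falls back to plain linear
    interpolation when the tensors are (anti)parallel or near zero,
    where the spherical formula is numerically unstable.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)

    n0 = np.linalg.norm(v0)
    n1 = np.linalg.norm(v1)
    if n0 < eps or n1 < eps:
        return (1.0 - t) * v0 + t * v1

    # Cosine of the angle between the two tensors, treated as flat vectors.
    dot = np.clip(np.sum((v0 / n0) * (v1 / n1)), -1.0, 1.0)
    if abs(dot) > 1.0 - eps:
        return (1.0 - t) * v0 + t * v1

    theta = np.arccos(dot)
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1


# Toy stand-ins for one parameter tensor from each parent model.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
merged = slerp(0.5, a, b)
```

In a real merge this kind of interpolation would be applied tensor by tensor across both checkpoints, which is possible here because both parent models share the Llama 2 7B architecture.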

  1. Vicuna

    Model Details

    Vicuna is a chat assistant trained by fine-tuning Llama 2 on user-shared conversations collected from ShareGPT.

    • Developed by: LMSYS
    • Model type: An auto-regressive language model based on the transformer architecture
    • License: Llama 2 Community License Agreement
    • Finetuned from model: Llama 2

    Model Sources

  2. MetaMath Llemma

    Model Details

    MetaMath-Llemma-7B is fully fine-tuned on the MetaMathQA datasets and based on the Llemma-7B model. Using the MetaMathQA datasets and changing the base model from Llama-2-7B to Llemma-7B boosts MATH performance from 19.8 to 30.0.