---
license: apache-2.0
tags:
  - merge
  - mergekit
  - vilm/vinallama-7b-chat
---

# VinaLLaMA - State-of-the-art Vietnamese LLMs


Read our Paper

Prompt Format (ChatML):

```
<|im_start|>system
Bạn là một trợ lí AI hữu ích. Hãy trả lời người dùng một cách chính xác.
<|im_end|>
<|im_start|>user
Hello world!<|im_end|>
<|im_start|>assistant
```

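In code, this prompt can be built with the 🤗 Transformers chat-template API. The sketch below assumes the vilm/vinallama-7b-chat tokenizer ships a ChatML chat template (if it does not, the string can be assembled by hand exactly as shown above); the Vietnamese system message translates to "You are a helpful AI assistant. Answer the user accurately."

```python
from transformers import AutoTokenizer

# Assumption: the vilm/vinallama-7b-chat tokenizer defines a ChatML chat template.
tokenizer = AutoTokenizer.from_pretrained("vilm/vinallama-7b-chat")

messages = [
    {
        "role": "system",
        "content": "Bạn là một trợ lí AI hữu ích. Hãy trả lời người dùng một cách chính xác.",
    },
    {"role": "user", "content": "Hello world!"},
]

# add_generation_prompt=True appends the trailing "<|im_start|>assistant" turn.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```
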
## Evaluation

This table is copied from VBD-LLaMA2, with updated results for VinaLLaMA-12.5B-chat-DUS.

| Model | Model size | arc_vi (acc) | hellaswag_vi (acc) | mmlu_vi (acc) | truthfulqa_vi (acc) | Average |
|---|---|---|---|---|---|---|
| URA-LLaMA-13B | 13B | 0.3752 | 0.4830 | 0.3973 | 0.4574 | 0.4282 |
| BLOOMZ-7B | 7B | 0.3205 | 0.4930 | 0.3975 | 0.4523 | 0.4158 |
| PhoGPT-7B5-Instruct | 7.5B | 0.2470 | 0.2578 | 0.2413 | 0.4759 | 0.3055 |
| SeaLLM-7B-chat | 7B | 0.3607 | 0.5112 | 0.3339 | 0.4948 | 0.4252 |
| Vietcuna-7b-v3 | 7B | 0.3419 | 0.4939 | 0.3354 | 0.4807 | 0.4130 |
| VinaLLaMA-2.7B-chat | 2.7B | 0.3273 | 0.4814 | 0.3051 | 0.4972 | 0.4028 |
| VinaLLaMA-7B-chat | 7B | 0.4239 | 0.5407 | 0.3932 | 0.5251 | 0.4707 |
| VBD-LLaMA2-7B-50b | 7B | 0.3222 | 0.5195 | 0.2964 | 0.4614 | 0.3999 |
| VBD-LLaMA2-7B-50b-Chat | 7B | 0.3585 | 0.5207 | 0.3444 | 0.5179 | 0.4354 |
| VinaLLaMA-12.5B-chat-DUS | 12.5B | 0.4325 | 0.5816 | 0.3875 | 0.5850 | 0.4967 |
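
For reference, a minimal evaluation sketch with EleutherAI's lm-evaluation-harness is shown below. The four Vietnamese task names mirror the table's columns but are not part of the upstream harness, so this assumes an install (a fork or custom task definitions) that registers them; the model id used here is the 7B chat base from this card.

```python
# Sketch only: assumes the arc_vi / hellaswag_vi / mmlu_vi / truthfulqa_vi tasks
# are registered in the installed lm-evaluation-harness (they are not upstream).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=vilm/vinallama-7b-chat,dtype=bfloat16",
    tasks=["arc_vi", "hellaswag_vi", "mmlu_vi", "truthfulqa_vi"],
    batch_size=8,
)

# Print per-task metrics as reported by the harness.
for task, metrics in results["results"].items():
    print(task, metrics)
```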

## Merging Methods

This model is a passthrough self-merge of vilm/vinallama-7b-chat made with LazyMergekit: selected layer ranges of the base model are duplicated (depth up-scaling, the "DUS" in the model name) to grow it from 7B to roughly 12.5B parameters.

## 🧩 Configuration

```yaml
slices:
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [0, 16]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [8, 16]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [8, 16]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [16, 24]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [16, 24]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [24, 28]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [24, 28]
  - sources:
    - model: vilm/vinallama-7b-chat
      layer_range: [28, 32]
merge_method: passthrough
dtype: bfloat16
```
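
This configuration is standard mergekit YAML, so it can be executed with the `mergekit-yaml` CLI (for example `mergekit-yaml config.yaml ./vinallama-12.5b-dus --copy-tokenizer`). The sketch below shows one way to load and query the merged checkpoint; the local output directory name `./vinallama-12.5b-dus` is an assumption for illustration, not something fixed by this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: ./vinallama-12.5b-dus is the directory written by mergekit-yaml.
model_path = "./vinallama-12.5b-dus"

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",
)

# ChatML prompt in the format shown earlier in this card.
prompt = (
    "<|im_start|>system\n"
    "Bạn là một trợ lí AI hữu ích. Hãy trả lời người dùng một cách chính xác.\n"
    "<|im_end|>\n"
    "<|im_start|>user\n"
    "Hello world!<|im_end|>\n"
    "<|im_start|>assistant\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```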