Model Card for Model ID

Fine-tuned from llama-2-13b on the huangyt/FINETUNE4 dataset, about 38k training examples in total.

Fine-Tuning Information

  • GPU: RTX4090 (single core / 24564MiB)
  • model: meta-llama/Llama-2-13b-hf
  • dataset: huangyt/FINETUNE4 (about 38k training examples)
  • peft_type: LoRA
  • lora_rank: 16
  • lora_target: q_proj, k_proj, v_proj, o_proj
  • per_device_train_batch_size: 8
  • gradient_accumulation_steps: 8
  • learning_rate: 4e-4
  • epoch: 1
  • precision: bf16
  • quantization: load_in_4bit
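
As a rough illustration of what the batch settings above imply, the sketch below computes the effective batch size and the approximate number of optimizer steps in the single epoch. It assumes one GPU and uses 38,000 as a stand-in for the "about 38k" training examples. The exact count is not stated in this card, so the step count is only an estimate.

```python
import math

# Values taken from the Fine-Tuning Information list
per_device_train_batch_size = 8
gradient_accumulation_steps = 8
num_epochs = 1

# Assumptions (not stated exactly in the card)
num_gpus = 1                    # a single RTX 4090
approx_train_examples = 38_000  # "about 38k", rounded

# Examples consumed per optimizer update
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)

# Rounded up, assuming the last partial batch is kept
steps_per_epoch = math.ceil(approx_train_examples / effective_batch_size)
total_optimizer_steps = steps_per_epoch * num_epochs

print(f"effective batch size: {effective_batch_size}")
print(f"approx. optimizer steps: {total_optimizer_steps}")
```

Under these assumptions each optimizer update sees 64 examples, so the whole run is on the order of a few hundred updates.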

Fine-Tuning Detail

  • train_loss: 0.579
  • train_runtime: 4:06:11 (using DeepSpeed)

Evaluation

  • Compared against Llama-2-13b on four benchmarks: ARC, HellaSwag, MMLU, and TruthfulQA
  • The scores below were measured locally, with the model loaded using load_in_8bit
Model                                    Average  ARC    HellaSwag  MMLU   TruthfulQA
FINETUNE4_3.8w-r4-q_k_v_o                56.67    52.13  79.38      54.54  40.64
FINETUNE4_3.8w-r8-q_k_v_o                56.84    52.30  79.58      54.50  40.98
FINETUNE4_3.8w-r16-q_k_v_o               57.28    53.92  79.92      55.61  39.65
FINETUNE4_3.8w-r4-gate_up_down           55.93    51.71  79.13      53.24  39.63
FINETUNE4_3.8w-r8-gate_up_down           55.93    51.37  79.29      53.62  39.45
FINETUNE4_3.8w-r16-gate_up_down          56.35    52.56  79.28      55.27  38.31
FINETUNE4_3.8w-r4-q_k_v_o_gate_up_down   56.42    53.92  79.09      53.93  38.74
FINETUNE4_3.8w-r8-q_k_v_o_gate_up_down   56.11    51.02  79.24      53.11  41.08
FINETUNE4_3.8w-r16-q_k_v_o_gate_up_down  56.83    53.67  79.49      54.79  39.36
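
The Average column appears to be the plain arithmetic mean of the four benchmark scores. A minimal check on the first two rows of the locally measured table (the helper name `mean_score` is hypothetical and only used for this illustration):

```python
def mean_score(scores):
    """Arithmetic mean of benchmark scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)


# (ARC, HellaSwag, MMLU, TruthfulQA) copied from the locally measured table
local_rows = {
    "FINETUNE4_3.8w-r4-q_k_v_o": (52.13, 79.38, 54.54, 40.64),
    "FINETUNE4_3.8w-r8-q_k_v_o": (52.30, 79.58, 54.50, 40.98),
}

averages = {name: mean_score(scores) for name, scores in local_rows.items()}
for name, avg in averages.items():
    print(name, avg)
```

Both rows reproduce the reported averages (56.67 and 56.84), which supports reading Average as an unweighted mean.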

  • The scores below are taken from HuggingFaceH4/open_llm_leaderboard
Model                                    Average  ARC    HellaSwag  MMLU   TruthfulQA
FINETUNE4_3.8w-r4-q_k_v_o                57.98    54.78  81.4       54.73  41.02
FINETUNE4_3.8w-r8-q_k_v_o                58.96    57.68  81.91      54.95  41.31
FINETUNE4_3.8w-r16-q_k_v_o               58.46    56.23  81.98      55.87  39.76
FINETUNE4_3.8w-r4-gate_up_down           57.94    55.8   81.74      55.09  39.12
FINETUNE4_3.8w-r8-gate_up_down           57.85    54.35  82.13      55.33  39.6
FINETUNE4_3.8w-r16-gate_up_down          57.93    55.03  81.97      56.64  38.07
FINETUNE4_3.8w-r4-q_k_v_o_gate_up_down   58.04    56.31  81.43      55.3   39.11
FINETUNE4_3.8w-r8-q_k_v_o_gate_up_down   58.16    55.97  81.53      54.42  40.72
FINETUNE4_3.8w-r16-q_k_v_o_gate_up_down  58.61    57.25  81.49      55.9   39.79
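
The suffixes in the model names appear to encode the LoRA rank (r4, r8, r16) and the targeted projection modules. To give a sense of how the variants differ in adapter size, the sketch below estimates their trainable LoRA parameter counts. The architecture dimensions (hidden size 5120, MLP intermediate size 13824, 40 decoder layers) are assumptions taken from the public Llama-2-13b configuration, not from this card, and the count assumes the usual LoRA pair of matrices A (r × in) and B (out × r) with no bias terms.

```python
HIDDEN = 5120         # assumed hidden size of Llama-2-13b
INTERMEDIATE = 13824  # assumed MLP intermediate size
NUM_LAYERS = 40       # assumed number of decoder layers

# (in_features, out_features) of each projection inside one decoder layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

# Target-module groups matching the suffixes used in the tables
TARGET_SETS = {
    "q_k_v_o": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "gate_up_down": ["gate_proj", "up_proj", "down_proj"],
}
TARGET_SETS["q_k_v_o_gate_up_down"] = (
    TARGET_SETS["q_k_v_o"] + TARGET_SETS["gate_up_down"]
)


def lora_param_count(rank, targets):
    """Trainable LoRA parameters: rank * (in + out) per module, over all layers."""
    per_layer = sum(
        rank * (MODULE_SHAPES[m][0] + MODULE_SHAPES[m][1]) for m in targets
    )
    return per_layer * NUM_LAYERS


for name, targets in TARGET_SETS.items():
    for rank in (4, 8, 16):
        print(f"r{rank}-{name}: {lora_param_count(rank, targets):,}")
```

Under these assumptions the adapter size scales linearly with rank, and targeting the MLP projections adds more parameters per rank than targeting the attention projections.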

How to convert dataset to json

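  • Pass the dataset name to load_dataset, and pass the number of leading examples you want to take
  • Check the dataset's column names and fill them into the example fields (for example system_prompt, question, response)
  • Finally, set the path where the JSON file is saved (json_filename)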
import json
from datasets import load_dataset

# Load the dataset in streaming mode; dataset.take(n) can be used to keep only the first n examples
dataset = load_dataset("huangyt/FINETUNE4", split="train", streaming=True)

# Extract the required fields into a new list of dicts
extracted_data = []
for example in dataset:
    extracted_example = {
        "instruction": example["instruction"],
        "input": example["input"],
        "output": example["output"]
    }
    extracted_data.append(extracted_example)

# Name of the output JSON file
json_filename = "FINETUNE4.json"

# Write the JSON file as UTF-8, keeping non-ASCII text readable
with open(json_filename, "w", encoding="utf-8") as json_file:
    json.dump(extracted_data, json_file, indent=4, ensure_ascii=False)

print(f"Data extracted and saved to {json_filename}")
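
The three steps above (take the first n examples, pick the columns, write the JSON file) can also be sketched with the standard library alone, which is convenient for trying the logic without downloading anything. Here `itertools.islice` plays the role that `take(n)` plays for a streamed dataset. The function name `export_first_n`, the sample rows, and the file name `sample_subset.json` are all hypothetical and exist only for this illustration.

```python
import json
from itertools import islice


def export_first_n(examples, fields, n, json_filename):
    """Write the chosen fields of the first n examples to a JSON file."""
    rows = [
        {field: example[field] for field in fields}
        for example in islice(examples, n)
    ]
    with open(json_filename, "w", encoding="utf-8") as json_file:
        json.dump(rows, json_file, indent=4, ensure_ascii=False)
    return rows


# Stand-in data; in practice `examples` would be the streamed dataset
sample = [
    {"instruction": "Add", "input": "1 2", "output": "3", "extra": "dropped"},
    {"instruction": "Echo", "input": "hi", "output": "hi", "extra": "dropped"},
    {"instruction": "Skip", "input": "", "output": "", "extra": "dropped"},
]

rows = export_first_n(
    sample, ["instruction", "input", "output"], 2, "sample_subset.json"
)
print(f"Saved {len(rows)} examples")
```

Passing the field list as an argument keeps the column names in one place, so adapting the sketch to a dataset with different columns only means changing that list.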

Dataset used to train CHIH-HUNG/llama-2-13b-FINETUNE4_3.8w-r4-q_k_v_o