REILX
/

Llama-3-8B-Instruct-ruozhiba-lora

Model card Files Files and versions Community

基于ruozhiba对Llama-3-8B-Instruct进行微调。

模型：

https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct

数据集：

https://huggingface.co/datasets/LooksJuicy/ruozhiba

训练工具

https://github.com/hiyouga/LLaMA-Factory

测评方式：

使用opencompass(https://github.com/open-compass/OpenCompass/ )，测试工具基于CEval和MMLU对微调之后的模型和原始模型进行测试。
测试模型分别为：

Llama-3-8B
Llama-3-8B-Instruct
LLama3-Instruct-sft-ruozhiba,使用ruozhiba数据对Llama-3-8B-Instruct使用sft方式lora微调

结果

模型名称	CEVAL	MMLU
LLama3	49.91	66.62
LLama3-Instruct	50.55	67.15
LLama3-Instruct-sft-ruozhiba-3epoch	50.87	67.51

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 1
eval_batch_size: 1
seed: 42
distributed_type: multi-GPU
num_devices: 8
gradient_accumulation_steps: 2
total_train_batch_size: 16
total_eval_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 20
num_epochs: 3.0
mixed_precision_training: Native AMP

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model is not currently available via any of the supported Inference Providers.

The model cannot be deployed to the HF Inference API: The model has no library tag.

Dataset used to train REILX/Llama-3-8B-Instruct-ruozhiba-lora

Collection including REILX/Llama-3-8B-Instruct-ruozhiba-lora

Llama3-SFT

A series of fine-tuned models based on the Llama model • 5 items • Updated Jul 9, 2024