[Reproducing] Stanford Alpaca: An Instruction-following LLaMA Model

This is the repo for reproducing Stanford Alpaca: An Instruction-following LLaMA Model. We fine-tune several Llama 2-based large language models on a medical QA dataset. The repo contains:

The 5K conversations between patients and physicians used for fine-tuning the models.
The code for preparing the data.
The code for fine-tuning the models.
The links for testing the models.

Dataset

We use the 5K generated dataset from ChatDoctor. The dataset consists of conversations between patients and physicians generated with ChatGPT (GenMedGPT-5k) and a disease database. The dataset was also curated and adapted to Indonesian.

GenMedGPT-5k-id.json contains the 5K instruction-following examples we used for fine-tuning the models. This JSON file is a list of dictionaries; each dictionary contains the following fields:

instruction: str, describes the task the model should perform.
input: str, optional context or input for the task. For example, when the instruction is "Summarize the following article", the input is the article.
output: str, the answer to the instruction, generated with ChatGPT.
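As a minimal sketch of how the file could be loaded and checked against the three fields above, the snippet below reads the JSON list and validates each record. The helper names (`validate_records`, `load_dataset`) are hypothetical and not part of this repo, and the sketch assumes that a record without extra context stores an empty string in `input` rather than omitting the key.

```python
import json

REQUIRED_FIELDS = ("instruction", "input", "output")


def validate_records(records):
    """Check that records is a list of dicts with the three string fields."""
    if not isinstance(records, list):
        raise ValueError("expected a JSON list of dictionaries")
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            raise ValueError(f"record {i}: expected a dictionary")
        for field in REQUIRED_FIELDS:
            # ASSUMPTION: every field is present; "input" may be an empty string.
            if not isinstance(rec.get(field), str):
                raise ValueError(f"record {i}: field '{field}' is missing or not a string")
    return records


def load_dataset(path="GenMedGPT-5k-id.json"):
    """Load the dataset file and validate its structure."""
    with open(path, encoding="utf-8") as f:
        return validate_records(json.load(f))
```

Calling `load_dataset()` from the directory that holds the JSON file returns the validated list, or raises `ValueError` naming the first malformed record.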

If you're interested in fine-tuning with your own data, it's essential to follow the default prompt format that the model was trained with. The prompt for Llama 2 is structured as follows:

<s>[INST] <<SYS>>
{{ instruction }}
<</SYS>>

{{ input }} [/INST] {{ output }} </s>
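The template above can be filled in with plain string formatting. The following is a sketch only: `build_llama2_prompt` is a hypothetical helper, not the code from fine-tuning.ipynb. Note also that some tokenizer configurations add the `<s>` token automatically, in which case the literal `<s>` in the string may need to be dropped; that depends on your setup and is not verified here.

```python
# Mirrors the Llama 2 template shown above, line for line.
LLAMA2_TEMPLATE = (
    "<s>[INST] <<SYS>>\n"
    "{instruction}\n"
    "<</SYS>>\n"
    "\n"
    "{input} [/INST] {output} </s>"
)


def build_llama2_prompt(instruction, input_text, output):
    """Fill the template with one record's fields (hypothetical helper)."""
    return LLAMA2_TEMPLATE.format(
        instruction=instruction.strip(),
        input=input_text.strip(),
        output=output.strip(),
    )
```

For a dataset record `rec`, this would be called as `build_llama2_prompt(rec["instruction"], rec["input"], rec["output"])`.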

Meanwhile, the prompt for PolyLM and InternLM (adapted to Indonesian) is structured similarly to this:

Di bawah ini adalah instruksi yang menjelaskan tugas, dipasangkan dengan masukan yang memberikan konteks lebih lanjut. Tulis tanggapan yang melengkapi permintaan dengan tepat.

Instruksi:
{instruction}

Masukan:
{input}

Tanggapan:
{output}
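In English, the Indonesian preamble reads: "Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request." The sections are Instruction, Input, and Response. A minimal sketch of filling this template is shown below; `build_id_prompt` and its `include_output` flag are hypothetical names introduced for illustration, with the flag leaving the response empty so the model can complete it at inference time.

```python
# Mirrors the Indonesian template shown above.
ID_TEMPLATE = (
    "Di bawah ini adalah instruksi yang menjelaskan tugas, dipasangkan dengan "
    "masukan yang memberikan konteks lebih lanjut. Tulis tanggapan yang "
    "melengkapi permintaan dengan tepat.\n\n"
    "Instruksi:\n{instruction}\n\n"
    "Masukan:\n{input}\n\n"
    "Tanggapan:\n{output}"
)


def build_id_prompt(record, include_output=True):
    """Format one dataset record; omit the output to build an inference prompt."""
    return ID_TEMPLATE.format(
        instruction=record["instruction"],
        input=record["input"],
        output=record["output"] if include_output else "",
    )
```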

Fine-tuning the Model

We fine-tune our models following the steps from Stanford Alpaca. We chose to train several Llama-based models: PolyLM-1.7B, Llama-2-7B, and InternLM-7B, with the following hyperparameters:

Hyperparameter	PolyLM-1.7B	Llama-2-7B	InternLM-7B
Batch size	128	128	128
Learning rate	3e-4	3e-4	3e-4
Epochs	3	3	3
Max length	256	256	256
Weight decay	0	0	0
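A batch size of 128 is usually reached through gradient accumulation rather than in a single forward pass. The sketch below captures the table values in a small config object and derives the accumulation steps. The `micro_batch_size` of 4 is an assumption for illustration (the table does not state it), and the step count is a rough estimate that assumes exactly 5,000 examples; the actual trainer may count steps differently.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values from the hyperparameter table (identical for all three models).
    batch_size: int = 128
    learning_rate: float = 3e-4
    epochs: int = 3
    max_length: int = 256
    weight_decay: float = 0.0
    # ASSUMPTION: per-device micro batch size; not given in the table.
    micro_batch_size: int = 4

    @property
    def gradient_accumulation_steps(self):
        """Micro batches accumulated to reach the effective batch size."""
        return self.batch_size // self.micro_batch_size

    def optimizer_steps(self, num_examples):
        """Approximate number of optimizer updates over all epochs."""
        return self.epochs * math.ceil(num_examples / self.batch_size)


cfg = TrainConfig()
print(cfg.gradient_accumulation_steps)  # 128 // 4 = 32
print(cfg.optimizer_steps(5000))        # 3 * ceil(5000 / 128) = 120
```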

To reproduce our fine-tuning runs for Llama 2, first install the requirements:

pip install -r requirements.txt

The code for fine-tuning is available in fine-tuning.ipynb, which has four sections: data pre-processing, fine-tuning with Llama 2, fine-tuning with PolyLM, and fine-tuning with InternLM.

Training procedure

The following bitsandbytes quantization config was used during training:

quant_method: bitsandbytes
load_in_8bit: True
load_in_4bit: False
llm_int8_threshold: 6.0
llm_int8_skip_modules: None
llm_int8_enable_fp32_cpu_offload: False
llm_int8_has_fp16_weight: False
bnb_4bit_quant_type: fp4
bnb_4bit_use_double_quant: False
bnb_4bit_compute_dtype: float32
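The values above correspond to the keyword arguments of `BitsAndBytesConfig` in the transformers library. The sketch below shows one way to rebuild the config from them; it assumes `torch`, `transformers`, and `bitsandbytes` are installed, and `make_bnb_config` is a hypothetical helper name, not code taken from fine-tuning.ipynb. Since `load_in_8bit` is True, the `bnb_4bit_*` entries are library defaults that have no effect on this run.

```python
# Settings copied from the list above.
BNB_KWARGS = {
    "load_in_8bit": True,
    "load_in_4bit": False,
    "llm_int8_threshold": 6.0,
    "llm_int8_skip_modules": None,
    "llm_int8_enable_fp32_cpu_offload": False,
    "llm_int8_has_fp16_weight": False,
    "bnb_4bit_quant_type": "fp4",
    "bnb_4bit_use_double_quant": False,
    "bnb_4bit_compute_dtype": "float32",
}


def make_bnb_config():
    """Build a BitsAndBytesConfig from the settings above (hypothetical helper)."""
    # Imported lazily so the settings can be inspected without a GPU stack.
    import torch
    from transformers import BitsAndBytesConfig

    kwargs = dict(BNB_KWARGS)
    # Convert the dtype name into the corresponding torch dtype.
    kwargs["bnb_4bit_compute_dtype"] = getattr(torch, kwargs["bnb_4bit_compute_dtype"])
    return BitsAndBytesConfig(**kwargs)
```

The resulting object would typically be passed as `quantization_config=make_bnb_config()` to `AutoModelForCausalLM.from_pretrained`.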

Framework versions

PEFT 0.6.0.dev0

Testing the Model

These are the links for testing the fine-tuned models:

Authors

All interns below contributed equally and the order is determined by random draw.

All advised by Firqa Aqilla Noor Arasyi

fadliaulawi/polylm-1.7b-finetuned