|
---
datasets:
- anon8231489123/ShareGPT_Vicuna_unfiltered
language:
- zh
- en
---
|
|
|
*TODO*: Upload pending; training is finished, still testing.
|
*Update*: Having a bit of an issue with the tokenizer; still figuring things out.
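For anyone hitting similar tokenizer problems, below is a minimal debugging sketch (an assumption on my part, not the exact fix): Yi's original tokenizer ships custom code, so loading the slow tokenizer with `trust_remote_code=True` and checking the encode/decode round-trip is a reasonable first step.

```python
from transformers import AutoTokenizer

# Hypothetical debugging sketch: try the slow tokenizer with Yi's custom
# code enabled, then verify that encode/decode round-trips cleanly.
tok = AutoTokenizer.from_pretrained(
    "01-ai/Yi-6B",           # or the local checkpoint path used below
    trust_remote_code=True,  # Yi's tokenizer may rely on custom code
    use_fast=False,          # rule out fast-tokenizer conversion issues
)

text = "你好, world!"
ids = tok(text).input_ids
print(ids)
print(tok.decode(ids))       # should reproduce the input text
```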
|
|
|
|
|
This is a reproduction of Vicuna, but based on Yi-6B. The training data I used was `ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json`.
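For reference, each record in the ShareGPT-style JSON is a dict with an `id` and a `conversations` list of alternating `human`/`gpt` turns; a quick way to inspect it:

```python
import json

# Inspect the ShareGPT-style training data: a list of records, each with
# an "id" and a "conversations" list of {"from": ..., "value": ...} turns.
with open("ShareGPT_V3_unfiltered_cleaned_split_no_imsorry.json") as f:
    data = json.load(f)

print(f"{len(data)} conversations")
for turn in data[0]["conversations"]:
    print(turn["from"], "->", turn["value"][:80])
```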
|
|
|
Hyperparameters:
|
```
CUDA_VISIBLE_DEVICES=0,1,2,3,5 torchrun --nproc_per_node 5 ../supervised_finetuning.py \
--model_type auto \
--model_name_or_path /data/llm/models/Pretrained/yi-6B/01ai/Yi-6B \
--tokenizer_name_or_path /data/llm/models/Pretrained/yi-6B/01ai/Yi-6B \
--train_file_dir ../data/finetune/vicuna/ \
--per_device_train_batch_size 2 \
--do_train \
--max_train_samples -1 \
--num_train_epochs 3 \
--learning_rate 2e-5 \
--weight_decay 0. \
--bf16 \
--use_peft False \
--logging_strategy steps \
--logging_steps 10 \
--save_strategy epoch \
--save_total_limit 5 \
--gradient_accumulation_steps 1 \
--preprocessing_num_workers 8 \
--output_dir ../outputs/20240106_yi6B_vicuna \
--overwrite_output_dir \
--ddp_timeout 30000 \
--logging_first_step True \
--torch_dtype bfloat16 \
--device_map auto \
--report_to tensorboard \
--ddp_find_unused_parameters False \
--gradient_checkpointing True \
--cache_dir ./cache \
--model_max_length 4096 \
--deepspeed ../deepspeed_zero_stage2_config_no16.json \
--template_name yi
```
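The `--deepspeed` flag points at a ZeRO stage-2 config. The file itself isn't reproduced here; as a rough sketch (my assumption from the filename, with `no16` suggesting no fp16/bf16 section since precision comes from the `--bf16` flag), a config like it could be generated with:

```python
import json

# Hypothetical reconstruction of deepspeed_zero_stage2_config_no16.json:
# plain ZeRO stage-2 sharding, batch sizes deferred to the HF trainer,
# and no fp16/bf16 section (precision is set via the --bf16 flag instead).
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_scatter": True,
    },
}

with open("deepspeed_zero_stage2_config_no16.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```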
|
|
|
Training used 5×A800 GPUs for 3 epochs:
|
```
***** train metrics *****
  epoch                    =        3.0
  train_loss               =     0.3785
  train_runtime            = 1 day, 10:01:13.95
  train_samples            =      93204
  train_samples_per_second =       2.24
  train_steps_per_second   =      0.224
```
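As a sanity check, these numbers are consistent with the command above: a per-device batch of 2 across 5 GPUs with no gradient accumulation gives a global batch of 10, so 93,204 samples is about 9,320 steps per epoch, or roughly 27,960 steps over 3 epochs; at the reported 0.224 steps/second that works out to about 34.7 hours, in line with the logged runtime.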
|
|
|
From some preliminary results, we can see that the conversation is natural and informative (unsurprisingly).
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6413d7be996b2e426f230fb7/WfQYyyLxtXA2KlePmIPQJ.png) |
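To try the model yourself, something like the following should work. This is a sketch, not the exact script used for these screenshots: the output path comes from the training command above, and the Vicuna-style prompt shown is illustrative (generation should follow the same `yi` template applied in training).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch of chatting with the fine-tuned checkpoint. The prompt format
# below is illustrative; match the `yi` template used during training.
path = "../outputs/20240106_yi6B_vicuna"
tok = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    "USER: What is the capital of France?\nASSISTANT:"
)
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tok.decode(out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```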
|
|
|
We also observe that the unfiltering seems to be working! **Heads up**: some examples are unsafe and inappropriate. This is entirely for research purposes, to test how SFT data with alignment filtering removed affects the LLM's final output.
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6413d7be996b2e426f230fb7/pklSsljCRN34QuL2ZF2zU.png) |
|
|
|
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6413d7be996b2e426f230fb7/22pTSVkBCVlQ5N8A8JBkF.png) |
|
|
|
|
|
|
|
|