xiaol committed on
Commit fe1e2d7 (1 parent: 26b92f0)

New 3B native finetune model. Not sure why there are 32 .pth files, despite setting epoch_count to 20 and epoch_steps to 1000.


python train.py --load_model ../pretrained_models/RWKV-4-Raven-3B-v10x-Eng49%-Chn50%-Other1%-20230423-ctx4096.pth \
--wandb chatgal3b --data_file ../data/chatgal3b_text_document --data_type binidx \
--vocab_size 50277 --ctx_len 4096 --accumulate_grad_batches 4 --epoch_steps 1000 \
--epoch_count 20 --epoch_save 1 --micro_bsz 4 --n_layer 32 --n_embd 2560 \
--pre_ffn 0 --head_qk 0 --lr_init 1e-5 --lr_final 1e-5 --warmup_steps 50 \
--beta1 0.9 --beta2 0.999 --adam_eps 1e-8 --accelerator gpu --devices 4 \
--precision bf16 --strategy deepspeed_stage_2_offload --grad_cp 1
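Once the checkpoint below is downloaded, it can be loaded for inference. A minimal sketch using the rwkv pip package (ChatRWKV) follows; the strategy string, tokenizer file, and sampling parameters are illustrative assumptions, not part of this commit:

import os
os.environ['RWKV_JIT_ON'] = '1'  # optional JIT speedup

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Checkpoint added in this commit; the loader accepts the path with or without .pth
MODEL_PATH = 'RWKV-4-Raven-3B-v10x-Eng49%-Chn50%-Other1%-20230423-ctx4096-chatgal-epoch20-step1000'

model = RWKV(model=MODEL_PATH, strategy='cuda fp16')  # 'cpu fp32' also works, just slower
pipeline = PIPELINE(model, '20B_tokenizer.json')      # tokenizer file from the ChatRWKV repo
args = PIPELINE_ARGS(temperature=1.0, top_p=0.85)     # sampling settings are placeholders

out = pipeline.generate('Hello', token_count=100, args=args)
print(out)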

RWKV-4-Raven-3B-v10x-Eng49%-Chn50%-Other1%-20230423-ctx4096-chatgal-epoch20-step1000.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9d5d8a9c0759d4cc1a143250fac7cbd21ddbe78039451fa2b51a8f09563e33c1
+ size 5969345059
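To check that the LFS download is intact, hash the local file and compare it against the oid above. A short sketch (the local file name is assumed to match the name in this commit):

import hashlib

EXPECTED = '9d5d8a9c0759d4cc1a143250fac7cbd21ddbe78039451fa2b51a8f09563e33c1'
PATH = 'RWKV-4-Raven-3B-v10x-Eng49%-Chn50%-Other1%-20230423-ctx4096-chatgal-epoch20-step1000.pth'

h = hashlib.sha256()
with open(PATH, 'rb') as f:
    # Read in 1 MiB chunks; the file is ~6 GB, too large to hash in one pass
    for chunk in iter(lambda: f.read(1 << 20), b''):
        h.update(chunk)

assert h.hexdigest() == EXPECTED, 'checksum mismatch'
print('OK: sha256 matches the LFS pointer oid')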