Train with native PyTorch: 10 epochs, batch size 8, block size 512, lr 1e-4 (cosine schedule)
6f190f4
finnstrom3693
committed on
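
The hyperparameters in the commit message can be sketched as a small, dependency-free calculation. This is only an illustration, not the training script from this commit: the dataset size (`NUM_SEQUENCES`), the absence of warmup, and decay to a minimum learning rate of 0 are assumptions, and the helper names `total_steps` and `cosine_lr` are hypothetical. The formula is the standard closed form of cosine annealing, which is what `torch.optim.lr_scheduler.CosineAnnealingLR` computes when `T_max` is set to the total number of steps.

```python
import math

# Values taken from the commit message.
EPOCHS = 10
BATCH_SIZE = 8
BLOCK_SIZE = 512
BASE_LR = 1e-4

# Assumptions (not stated in the commit): dataset size, no warmup, decay to 0.
NUM_SEQUENCES = 1_000  # hypothetical number of 512-token training blocks
MIN_LR = 0.0


def total_steps(num_sequences: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps over the whole run, keeping the last partial batch."""
    return math.ceil(num_sequences / batch_size) * epochs


def cosine_lr(step: int, max_steps: int,
              base_lr: float = BASE_LR, min_lr: float = MIN_LR) -> float:
    """Cosine annealing from base_lr at step 0 down to min_lr at max_steps."""
    progress = min(step, max_steps) / max_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


steps = total_steps(NUM_SEQUENCES, BATCH_SIZE, EPOCHS)
tokens_per_step = BATCH_SIZE * BLOCK_SIZE  # tokens seen per optimizer step

# Print the learning rate at the start, midpoint, and end of training.
for s in (0, steps // 2, steps):
    print(f"step {s:5d}  lr {cosine_lr(s, steps):.2e}")
```

Under these assumptions the learning rate starts at 1e-4, passes through half that value at the midpoint of training, and reaches 0 at the final step. Stepping the schedule once per epoch instead of once per batch would use the same formula with `max_steps` set to the number of epochs.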