---
license: mit
---

- The model comes directly from PyTorch. Refer to the Python file. To reuse it, load the weights from the `.pth` file with `.load_state_dict()`. Good luck.

Training steps:

- step 0: train loss 4.2221, val loss 4.2306
- step 500: train loss 1.7526, val loss 1.9053
- step 1000: train loss 1.3949, val loss 1.6050
- step 1500: train loss 1.2625, val loss 1.5219
- step 2000: train loss 1.1860, val loss 1.5046
- step 2500: train loss 1.1254, val loss 1.4972
- step 3000: train loss 1.0694, val loss 1.4849
- step 3500: train loss 1.0211, val loss 1.5048
- step 4000: train loss 0.9643, val loss 1.5160
- step 4500: train loss 0.9121, val loss 1.5396
- step 5000: train loss 0.8673, val loss 1.5552
- step 5500: train loss 0.8052, val loss 1.5988
- step 6000: train loss 0.7611, val loss 1.6231
- step 6500: train loss 0.7087, val loss 1.6706
- step 7000: train loss 0.6644, val loss 1.7000
- step 7500: train loss 0.6187, val loss 1.7484
- step 8000: train loss 0.5818, val loss 1.7882
- step 8500: train loss 0.5350, val loss 1.8304
- step 9000: train loss 0.4973, val loss 1.8688
- step 9500: train loss 0.4638, val loss 1.9050
- step 9999: train loss 0.4333, val loss 1.9475
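
The reuse instructions above can be sketched roughly as follows. This is only an illustration: `TinyModel`, its layer sizes, and the file name `model.pth` are hypothetical stand-ins, not the actual architecture or file in this repository. In practice you would import the model class from the Python file and point `load_model` at the real `.pth` file. The example saves and reloads its own weights in a temporary directory so it runs on its own.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in; replace with the model class from the Python file."""

    def __init__(self, vocab_size: int = 100, n_embd: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, n_embd)
        self.head = nn.Linear(n_embd, vocab_size)

    def forward(self, idx: torch.Tensor) -> torch.Tensor:
        # (batch, time) token ids -> (batch, time, vocab_size) logits
        return self.head(self.embed(idx))


def load_model(path: str) -> nn.Module:
    """Build the model, then copy the saved weights into it."""
    model = TinyModel()
    # map_location="cpu" lets the file load on machines without a GPU
    state_dict = torch.load(path, map_location="cpu")
    # Keys and shapes in the file must match the model definition
    model.load_state_dict(state_dict)
    model.eval()  # switch to inference mode (disables dropout, etc.)
    return model


# Round trip with a temporary file so the sketch is self-contained
with tempfile.TemporaryDirectory() as tmp:
    ckpt = os.path.join(tmp, "model.pth")  # hypothetical file name
    original = TinyModel()
    torch.save(original.state_dict(), ckpt)
    restored = load_model(ckpt)

tokens = torch.tensor([[1, 2, 3]])
with torch.no_grad():
    logits = restored(tokens)
print(logits.shape)  # torch.Size([1, 3, 100])
```

Note that `load_state_dict()` raises an error if the parameter names or shapes in the file do not match the model you instantiate, so the class and its constructor arguments need to be the same as those used for training.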