dat

Saving weights and logs at step 500

5551a58 over 3 years ago

5.01 kB

	[21:04:06] - INFO - absl - A polynomial schedule was set with a non-positive `transition_steps` value; this results in a constant schedule with value `init_value`.
	/home/dat/pino/lib/python3.8/site-packages/jax/_src/numpy/lax_numpy.py:3132: UserWarning: Explicitly requested dtype <class 'jax._src.numpy.lax_numpy.int64'> requested in zeros is not available, and will be truncated to dtype int32. To enable more dtypes, set the jax_enable_x64 configuration option or the JAX_ENABLE_X64 shell environment variable. See https://github.com/google/jax#current-gotchas for more.
	lax._check_user_dtype_supported(dtype, "zeros")
	/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:386: UserWarning: jax.host_count has been renamed to jax.process_count. This alias will eventually be removed; please update your code.
	warnings.warn(
	/home/dat/pino/lib/python3.8/site-packages/jax/lib/xla_bridge.py:373: UserWarning: jax.host_id has been renamed to jax.process_index. This alias will eventually be removed; please update your code.
	warnings.warn(
	Epoch ... (1/3): 0%\| \| 0/3 [00:00<?, ?it/s]
	[21:04:07] - INFO - __main__ - Skipping to epoch 0 step 0
	Training...: 20%\|█████████████████████████▌ \| 250/1250 [03:59<12:00, 1.39it/s]
	Training...: 40%\|███████████████████████████████████████████████████▏ \| 500/1250 [06:59<09:01, 1.39it/s]
	Training...: 40%\|███████████████████████████████████████████████████▏ \| 500/1250 [07:23<09:01, 1.39it/s]
	Training...: 60%\|█████████████████████████████████████████████████████████████████████████████ \| 753/1250 [10:50<31:47, 3.84s/it]
	Training...: 80%\|█████████████████████████████████████████████████████████████████████████████████████████████████████▌ \| 1000/1250 [13:50<03:00, 1.39it/s]
	Evaluating ...: 6%\|████████▎ \| 2/31 [00:00<00:02, 11.32it/s]
	Step... (1000 \| Loss: 9.379836082458496, Acc: 0.047341905534267426): 33%\|█████████████████████████▎ \| 1/3 [16:39<33:18, 999.25s/it]

	Training...: 0%\|▏ \| 2/1250 [00:02<02:18, 9.03it/s]
	[21:21:14] - INFO - huggingface_hub.repository - git version 2.25.1
	git-lfs/2.9.2 (GitHub; linux amd64; go 1.13.5)
	[21:21:14] - DEBUG - huggingface_hub.repository - [Repository] is a valid git repo
	[21:22:01] - INFO - huggingface_hub.repository - Uploading LFS objects: 100% (27/27), 1.3 GB \| 56 MB/s, done.
	Training...: 0%\|▏ \| 2/1250 [00:51<9:00:35, 25.99s/it]
	Step... (1000 \| Loss: 9.379836082458496, Acc: 0.047341905534267426): 33%\|█████████████████████████ \| 1/3 [17:55<35:50, 1075.43s/it]
	Traceback (most recent call last):
	  File "./run_mlm_flax.py", line 850, in <module>
	    save_model_checkpoint(model, training_args.output_dir, state, with_opt=model_args.save_optimizer,
	  File "./run_mlm_flax.py", line 461, in save_model_checkpoint
	    f.write(to_bytes(state.opt_state))
	NameError: name 'to_bytes' is not defined
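
The NameError above only surfaced once the optimizer-state save path ran, well into the first epoch. The most likely cause is a missing `from flax.serialization import to_bytes` in the script, though the log alone cannot confirm that. A minimal sketch of a pre-flight check that could catch this kind of error before training starts is shown below. It inspects a function's bytecode for global names that are not defined. `missing_globals` and `save_opt_state` are hypothetical names for illustration and are not part of `run_mlm_flax.py`.

```python
import builtins
import dis


def missing_globals(fn):
    """Return the global names that fn's bytecode loads but that are undefined.

    Only the function's own code object is scanned; nested functions and
    comprehensions are not inspected in this sketch.
    """
    known = fn.__globals__
    return sorted({
        ins.argval
        for ins in dis.get_instructions(fn)
        if ins.opname == "LOAD_GLOBAL"
        and ins.argval not in known
        and not hasattr(builtins, ins.argval)
    })


# Hypothetical stand-in for the failing save function in the traceback.
def save_opt_state(path, opt_state):
    with open(path, "wb") as f:
        # `to_bytes` is never imported in this module, so calling this
        # function would raise NameError, as in the log above.
        f.write(to_bytes(opt_state))


print(missing_globals(save_opt_state))  # → ['to_bytes']
```

Running such a check on the checkpoint functions at startup would report the undefined name in seconds rather than after the first 1250 training steps. It does not replace the actual fix, which is to import the missing function.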