gptesla-small / log /debug_0.log
shng2025's picture
step 100
caf9820
raw
history blame
122 kB
07/24/2024 16:20:48 - INFO - __main__ - Distributed environment: MULTI_GPU Backend: nccl
Num processes: 4
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: fp16
07/24/2024 16:22:30 - INFO - __main__ - Distributed environment: MULTI_GPU Backend: nccl
Num processes: 4
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: fp16
07/24/2024 16:22:31 - WARNING - huggingface_hub.repository - /dli/gptesla-small/./ is already a clone of https://huggingface.co/shng2025/gptesla-small. Make sure you pull the latest changes with `repo.git_pull()`.
07/24/2024 16:22:31 - WARNING - huggingface_hub.repository - Revision `dashing-violet-117` does not exist. Created and checked out branch `dashing-violet-117`.
07/24/2024 16:22:31 - WARNING - huggingface_hub.repository -
07/24/2024 16:22:32 - DEBUG - datasets.utils._dataset_viewer - Dataset info for shng2025/gptesla-train is not completely ready yet.
07/24/2024 16:22:32 - INFO - datasets.builder - No config specified, defaulting to the single config: gptesla-train/default
07/24/2024 16:22:32 - INFO - datasets.info - Loading Dataset Infos from /usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#1, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#2, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#4, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#3, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#5, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#6, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#7, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#8, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#12, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#21, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#10, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#13, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#19, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#11, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#22, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#15, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#24, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#23, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#9, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#17, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#18, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#14, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#16, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#20, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#26, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#28, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#27, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#25, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#30, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#29, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#31, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#32, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#33, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#34, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#35, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#36, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#37, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#38, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#39, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#40, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#41, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#42, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#43, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#44, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#45, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#46, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#47, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#48, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#49, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#51, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#50, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#52, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#53, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#54, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:38 - DEBUG - datasets.iterable_dataset - dataloader worker#56, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#55, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#57, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#58, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#60, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#61, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#62, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#63, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#59, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#64, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#65, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#66, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#67, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#68, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#69, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#70, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#71, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#72, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#73, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#74, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#75, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#76, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#77, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#78, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#80, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#79, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#81, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#82, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#83, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#84, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#85, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#86, ': Starting to iterate over 2/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#88, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#87, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#89, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#91, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#90, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#92, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#93, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#94, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.iterable_dataset - dataloader worker#95, ': Starting to iterate over 1/183 shards.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497218 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10668116 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485847 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10512203 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495973 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488098 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485842 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487725 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486172 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10522596 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10949076 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10536479 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489575 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485918 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10610581 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511500 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486276 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488385 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487790 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10498167 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491327 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525688 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489599 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 11115863 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486801 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10640425 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487482 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10530453 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492277 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10863935 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10686322 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492554 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10493913 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:39 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489635 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500290 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497335 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486397 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10676628 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10751338 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491889 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10621496 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500930 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487097 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10598254 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497062 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10515063 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499607 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511515 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488608 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10562022 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491272 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525926 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488150 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499106 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:40 - DEBUG - datasets.packaged_modules.json.json - Batch of 10501535 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:41 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:41 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:41 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491547 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:22:41 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:41 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511604 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:22:57 - INFO - __main__ - Step 1: {'lr': 0.0, 'samples': 48, 'steps': 0, 'loss/train': 10.554669380187988}
07/24/2024 16:22:58 - INFO - __main__ - Step 2: {'lr': 7.142857142857143e-07, 'samples': 96, 'steps': 1, 'loss/train': 10.494059562683105}
07/24/2024 16:22:58 - INFO - __main__ - Step 3: {'lr': 1.4285714285714286e-06, 'samples': 144, 'steps': 2, 'loss/train': 10.507988929748535}
07/24/2024 16:22:58 - INFO - __main__ - Step 4: {'lr': 2.142857142857143e-06, 'samples': 192, 'steps': 3, 'loss/train': 10.415447235107422}
07/24/2024 16:22:58 - INFO - __main__ - Step 5: {'lr': 2.8571428571428573e-06, 'samples': 240, 'steps': 4, 'loss/train': 10.345850944519043}
07/24/2024 16:22:59 - INFO - __main__ - Step 6: {'lr': 3.5714285714285714e-06, 'samples': 288, 'steps': 5, 'loss/train': 10.195524215698242}
07/24/2024 16:22:59 - INFO - __main__ - Step 7: {'lr': 4.285714285714286e-06, 'samples': 336, 'steps': 6, 'loss/train': 10.09341812133789}
07/24/2024 16:22:59 - INFO - __main__ - Step 8: {'lr': 5e-06, 'samples': 384, 'steps': 7, 'loss/train': 9.965239524841309}
07/24/2024 16:22:59 - INFO - __main__ - Step 9: {'lr': 5.7142857142857145e-06, 'samples': 432, 'steps': 8, 'loss/train': 9.698853492736816}
07/24/2024 16:23:00 - INFO - __main__ - Step 10: {'lr': 6.428571428571429e-06, 'samples': 480, 'steps': 9, 'loss/train': 9.80683708190918}
07/24/2024 16:23:00 - INFO - __main__ - Step 11: {'lr': 7.142857142857143e-06, 'samples': 528, 'steps': 10, 'loss/train': 9.633079528808594}
07/24/2024 16:23:00 - INFO - __main__ - Step 12: {'lr': 7.857142857142858e-06, 'samples': 576, 'steps': 11, 'loss/train': 9.700591087341309}
07/24/2024 16:23:01 - INFO - __main__ - Step 13: {'lr': 8.571428571428573e-06, 'samples': 624, 'steps': 12, 'loss/train': 9.603139877319336}
07/24/2024 16:23:01 - INFO - __main__ - Step 14: {'lr': 9.285714285714286e-06, 'samples': 672, 'steps': 13, 'loss/train': 9.30308723449707}
07/24/2024 16:23:01 - INFO - __main__ - Step 15: {'lr': 1e-05, 'samples': 720, 'steps': 14, 'loss/train': 9.333526611328125}
07/24/2024 16:23:01 - INFO - __main__ - Step 16: {'lr': 1.0714285714285714e-05, 'samples': 768, 'steps': 15, 'loss/train': 8.336181640625}
07/24/2024 16:23:02 - INFO - __main__ - Step 17: {'lr': 1.1428571428571429e-05, 'samples': 816, 'steps': 16, 'loss/train': 9.075631141662598}
07/24/2024 16:23:02 - INFO - __main__ - Step 18: {'lr': 1.2142857142857142e-05, 'samples': 864, 'steps': 17, 'loss/train': 9.18478012084961}
07/24/2024 16:23:02 - INFO - __main__ - Step 19: {'lr': 1.2857142857142857e-05, 'samples': 912, 'steps': 18, 'loss/train': 8.96328353881836}
07/24/2024 16:23:03 - INFO - __main__ - Step 20: {'lr': 1.3571428571428572e-05, 'samples': 960, 'steps': 19, 'loss/train': 9.45018196105957}
07/24/2024 16:23:03 - INFO - __main__ - Step 21: {'lr': 1.4285714285714285e-05, 'samples': 1008, 'steps': 20, 'loss/train': 8.517333984375}
07/24/2024 16:23:03 - INFO - __main__ - Step 22: {'lr': 1.5e-05, 'samples': 1056, 'steps': 21, 'loss/train': 9.207684516906738}
07/24/2024 16:23:03 - INFO - __main__ - Step 23: {'lr': 1.5714285714285715e-05, 'samples': 1104, 'steps': 22, 'loss/train': 8.681092262268066}
07/24/2024 16:23:04 - INFO - __main__ - Step 24: {'lr': 1.642857142857143e-05, 'samples': 1152, 'steps': 23, 'loss/train': 8.316036224365234}
07/24/2024 16:23:04 - INFO - __main__ - Step 25: {'lr': 1.7142857142857145e-05, 'samples': 1200, 'steps': 24, 'loss/train': 8.944169044494629}
07/24/2024 16:23:04 - INFO - __main__ - Step 26: {'lr': 1.7857142857142855e-05, 'samples': 1248, 'steps': 25, 'loss/train': 8.878201484680176}
07/24/2024 16:23:05 - INFO - __main__ - Step 27: {'lr': 1.8571428571428572e-05, 'samples': 1296, 'steps': 26, 'loss/train': 9.158102989196777}
07/24/2024 16:23:05 - INFO - __main__ - Step 28: {'lr': 1.9285714285714285e-05, 'samples': 1344, 'steps': 27, 'loss/train': 9.14354419708252}
07/24/2024 16:23:05 - INFO - __main__ - Step 29: {'lr': 2e-05, 'samples': 1392, 'steps': 28, 'loss/train': 8.860624313354492}
07/24/2024 16:23:05 - INFO - __main__ - Step 30: {'lr': 2.0714285714285715e-05, 'samples': 1440, 'steps': 29, 'loss/train': 8.876450538635254}
07/24/2024 16:23:06 - INFO - __main__ - Step 31: {'lr': 2.1428571428571428e-05, 'samples': 1488, 'steps': 30, 'loss/train': 8.425738334655762}
07/24/2024 16:23:06 - INFO - __main__ - Step 32: {'lr': 2.214285714285714e-05, 'samples': 1536, 'steps': 31, 'loss/train': 8.942279815673828}
07/24/2024 16:23:06 - INFO - __main__ - Step 33: {'lr': 2.2857142857142858e-05, 'samples': 1584, 'steps': 32, 'loss/train': 8.757084846496582}
07/24/2024 16:23:06 - INFO - __main__ - Step 34: {'lr': 2.3571428571428575e-05, 'samples': 1632, 'steps': 33, 'loss/train': 8.699286460876465}
07/24/2024 16:23:07 - INFO - __main__ - Step 35: {'lr': 2.4285714285714285e-05, 'samples': 1680, 'steps': 34, 'loss/train': 8.857367515563965}
07/24/2024 16:23:07 - INFO - __main__ - Step 36: {'lr': 2.5e-05, 'samples': 1728, 'steps': 35, 'loss/train': 8.830195426940918}
07/24/2024 16:23:07 - INFO - __main__ - Step 37: {'lr': 2.5714285714285714e-05, 'samples': 1776, 'steps': 36, 'loss/train': 8.944982528686523}
07/24/2024 16:23:08 - INFO - __main__ - Step 38: {'lr': 2.642857142857143e-05, 'samples': 1824, 'steps': 37, 'loss/train': 8.670278549194336}
07/24/2024 16:23:08 - INFO - __main__ - Step 39: {'lr': 2.7142857142857144e-05, 'samples': 1872, 'steps': 38, 'loss/train': 8.710525512695312}
07/24/2024 16:23:08 - INFO - __main__ - Step 40: {'lr': 2.7857142857142858e-05, 'samples': 1920, 'steps': 39, 'loss/train': 7.902089595794678}
07/24/2024 16:23:08 - INFO - __main__ - Step 41: {'lr': 2.857142857142857e-05, 'samples': 1968, 'steps': 40, 'loss/train': 8.400484085083008}
07/24/2024 16:23:09 - INFO - __main__ - Step 42: {'lr': 2.9285714285714288e-05, 'samples': 2016, 'steps': 41, 'loss/train': 8.789310455322266}
07/24/2024 16:23:09 - INFO - __main__ - Step 43: {'lr': 3e-05, 'samples': 2064, 'steps': 42, 'loss/train': 8.754344940185547}
07/24/2024 16:23:09 - INFO - __main__ - Step 44: {'lr': 3.071428571428572e-05, 'samples': 2112, 'steps': 43, 'loss/train': 8.84192943572998}
07/24/2024 16:23:10 - INFO - __main__ - Step 45: {'lr': 3.142857142857143e-05, 'samples': 2160, 'steps': 44, 'loss/train': 8.784793853759766}
07/24/2024 16:23:10 - INFO - __main__ - Step 46: {'lr': 3.214285714285714e-05, 'samples': 2208, 'steps': 45, 'loss/train': 8.67403793334961}
07/24/2024 16:23:10 - INFO - __main__ - Step 47: {'lr': 3.285714285714286e-05, 'samples': 2256, 'steps': 46, 'loss/train': 8.51427173614502}
07/24/2024 16:23:10 - INFO - __main__ - Step 48: {'lr': 3.357142857142857e-05, 'samples': 2304, 'steps': 47, 'loss/train': 8.48193073272705}
07/24/2024 16:23:11 - INFO - __main__ - Step 49: {'lr': 3.428571428571429e-05, 'samples': 2352, 'steps': 48, 'loss/train': 8.518038749694824}
07/24/2024 16:23:11 - INFO - __main__ - Step 50: {'lr': 3.5000000000000004e-05, 'samples': 2400, 'steps': 49, 'loss/train': 8.63569450378418}
07/24/2024 16:23:11 - INFO - __main__ - Evaluating and saving model checkpoint
07/24/2024 16:23:14 - WARNING - datasets.iterable_dataset - Too many dataloader workers: 96 (max is dataset.n_shards=1). Stopping 95 dataloader workers.
07/24/2024 16:23:14 - INFO - datasets.iterable_dataset - To parallelize data loading, we give each process some shards (or data sources) to process. Therefore it's unnecessary to have a number of workers greater than dataset.n_shards=1. To enable more parallelism, please split the dataset in more files than 1.
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 1/1 shards.
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#3, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#2, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#1, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#4, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#5, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#6, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#7, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#8, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#9, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#10, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#11, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#12, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#13, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#14, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#15, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#17, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#16, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#18, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#19, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#20, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#21, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#22, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#23, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#24, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#26, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#27, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#28, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#29, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#25, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#30, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#31, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#32, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#33, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#34, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#35, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#36, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#37, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#38, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#39, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#40, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#41, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#42, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#43, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#44, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#45, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#46, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#47, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#48, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#49, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#50, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#51, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#52, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#53, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#54, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#55, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#56, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#57, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#58, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#59, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#60, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#61, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#62, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#63, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#64, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#66, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#65, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#67, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#68, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#69, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#70, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#71, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#72, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#73, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#74, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#75, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#76, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#77, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#78, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#80, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#79, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#81, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#82, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#83, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#84, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#85, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#86, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#87, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#88, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#89, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#90, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#91, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#92, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#93, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#94, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:23:14 - DEBUG - datasets.iterable_dataset - dataloader worker#95, ': Stopping... Number of dataset shards < num_workers (1<96).
07/24/2024 16:24:38 - INFO - __main__ - Distributed environment: MULTI_GPU Backend: nccl
Num processes: 4
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: fp16
07/24/2024 16:24:39 - WARNING - huggingface_hub.repository - /dli/gptesla-small/./ is already a clone of https://huggingface.co/shng2025/gptesla-small. Make sure you pull the latest changes with `repo.git_pull()`.
07/24/2024 16:24:39 - WARNING - huggingface_hub.repository - Revision `faithful-thunder-118` does not exist. Created and checked out branch `faithful-thunder-118`.
07/24/2024 16:24:39 - WARNING - huggingface_hub.repository -
07/24/2024 16:24:40 - DEBUG - datasets.utils._dataset_viewer - Dataset info for shng2025/gptesla-train is not completely ready yet.
07/24/2024 16:24:40 - INFO - datasets.builder - No config specified, defaulting to the single config: gptesla-train/default
07/24/2024 16:24:40 - INFO - datasets.info - Loading Dataset Infos from /usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json
07/24/2024 16:24:45 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#5, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#2, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#6, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#9, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#10, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#7, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#18, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#4, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#23, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#1, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#21, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#27, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#8, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#30, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#12, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#33, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#24, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#31, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#32, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#19, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#39, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#11, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#20, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#22, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#25, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#14, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#40, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#15, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#13, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#16, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#41, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#17, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#29, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#28, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#37, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#36, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#35, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#3, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#26, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#38, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#34, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#42, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#43, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#45, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#46, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#44, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#47, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#48, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#49, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#50, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#52, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#51, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#53, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#54, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#55, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#56, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#57, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#58, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#59, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#60, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#61, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#62, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#64, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#63, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#65, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#66, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#67, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#68, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#69, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#70, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#72, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#71, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#73, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#74, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#75, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#76, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#77, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#78, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#80, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#79, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#81, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#82, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#83, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#84, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#85, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#86, ': Starting to iterate over 2/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#87, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#88, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#89, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#90, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#91, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#92, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#93, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#94, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.iterable_dataset - dataloader worker#95, ': Starting to iterate over 1/183 shards.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500930 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497062 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499607 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486397 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511500 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525688 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10522596 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489599 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10512203 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497218 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10949076 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10562022 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491327 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488098 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489575 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10668116 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485847 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511604 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487790 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489635 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491272 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487097 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10515063 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10598254 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10621496 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10501535 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:46 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488150 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495973 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492554 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487725 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485918 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499106 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492277 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486276 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491889 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486801 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488385 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10536479 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488608 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485842 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500290 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511515 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491547 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10676628 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497335 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10610581 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486172 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487482 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10686322 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 11115863 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10498167 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10530453 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10493913 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525926 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10863935 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:47 - DEBUG - datasets.packaged_modules.json.json - Batch of 10640425 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:24:48 - DEBUG - datasets.packaged_modules.json.json - Batch of 10751338 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:25:03 - INFO - __main__ - Step 1: {'lr': 0.0, 'samples': 48, 'steps': 0, 'loss/train': 10.554669380187988}
07/24/2024 16:25:04 - INFO - __main__ - Step 2: {'lr': 7.142857142857143e-07, 'samples': 96, 'steps': 1, 'loss/train': 10.494059562683105}
07/24/2024 16:25:04 - INFO - __main__ - Step 3: {'lr': 1.4285714285714286e-06, 'samples': 144, 'steps': 2, 'loss/train': 10.507988929748535}
07/24/2024 16:25:04 - INFO - __main__ - Step 4: {'lr': 2.142857142857143e-06, 'samples': 192, 'steps': 3, 'loss/train': 10.415447235107422}
07/24/2024 16:25:05 - INFO - __main__ - Step 5: {'lr': 2.8571428571428573e-06, 'samples': 240, 'steps': 4, 'loss/train': 10.345850944519043}
07/24/2024 16:25:05 - INFO - __main__ - Step 6: {'lr': 3.5714285714285714e-06, 'samples': 288, 'steps': 5, 'loss/train': 10.195524215698242}
07/24/2024 16:25:05 - INFO - __main__ - Step 7: {'lr': 4.285714285714286e-06, 'samples': 336, 'steps': 6, 'loss/train': 10.09341812133789}
07/24/2024 16:25:06 - INFO - __main__ - Step 8: {'lr': 5e-06, 'samples': 384, 'steps': 7, 'loss/train': 9.965239524841309}
07/24/2024 16:25:06 - INFO - __main__ - Step 9: {'lr': 5.7142857142857145e-06, 'samples': 432, 'steps': 8, 'loss/train': 9.698853492736816}
07/24/2024 16:25:06 - INFO - __main__ - Step 10: {'lr': 6.428571428571429e-06, 'samples': 480, 'steps': 9, 'loss/train': 9.80683708190918}
07/24/2024 16:25:06 - INFO - __main__ - Step 11: {'lr': 7.142857142857143e-06, 'samples': 528, 'steps': 10, 'loss/train': 9.633079528808594}
07/24/2024 16:25:07 - INFO - __main__ - Step 12: {'lr': 7.857142857142858e-06, 'samples': 576, 'steps': 11, 'loss/train': 9.700591087341309}
07/24/2024 16:25:07 - INFO - __main__ - Step 13: {'lr': 8.571428571428573e-06, 'samples': 624, 'steps': 12, 'loss/train': 9.603139877319336}
07/24/2024 16:25:07 - INFO - __main__ - Step 14: {'lr': 9.285714285714286e-06, 'samples': 672, 'steps': 13, 'loss/train': 9.30308723449707}
07/24/2024 16:25:08 - INFO - __main__ - Step 15: {'lr': 1e-05, 'samples': 720, 'steps': 14, 'loss/train': 9.333526611328125}
07/24/2024 16:25:08 - INFO - __main__ - Step 16: {'lr': 1.0714285714285714e-05, 'samples': 768, 'steps': 15, 'loss/train': 8.336181640625}
07/24/2024 16:25:08 - INFO - __main__ - Step 17: {'lr': 1.1428571428571429e-05, 'samples': 816, 'steps': 16, 'loss/train': 9.075631141662598}
07/24/2024 16:25:08 - INFO - __main__ - Step 18: {'lr': 1.2142857142857142e-05, 'samples': 864, 'steps': 17, 'loss/train': 9.18478012084961}
07/24/2024 16:25:09 - INFO - __main__ - Step 19: {'lr': 1.2857142857142857e-05, 'samples': 912, 'steps': 18, 'loss/train': 8.96328353881836}
07/24/2024 16:25:09 - INFO - __main__ - Step 20: {'lr': 1.3571428571428572e-05, 'samples': 960, 'steps': 19, 'loss/train': 9.45018196105957}
07/24/2024 16:25:09 - INFO - __main__ - Step 21: {'lr': 1.4285714285714285e-05, 'samples': 1008, 'steps': 20, 'loss/train': 8.517333984375}
07/24/2024 16:25:09 - INFO - __main__ - Step 22: {'lr': 1.5e-05, 'samples': 1056, 'steps': 21, 'loss/train': 9.207684516906738}
07/24/2024 16:25:10 - INFO - __main__ - Step 23: {'lr': 1.5714285714285715e-05, 'samples': 1104, 'steps': 22, 'loss/train': 8.681092262268066}
07/24/2024 16:25:10 - INFO - __main__ - Step 24: {'lr': 1.642857142857143e-05, 'samples': 1152, 'steps': 23, 'loss/train': 8.316036224365234}
07/24/2024 16:25:10 - INFO - __main__ - Step 25: {'lr': 1.7142857142857145e-05, 'samples': 1200, 'steps': 24, 'loss/train': 8.944169044494629}
07/24/2024 16:25:11 - INFO - __main__ - Step 26: {'lr': 1.7857142857142855e-05, 'samples': 1248, 'steps': 25, 'loss/train': 8.878201484680176}
07/24/2024 16:25:11 - INFO - __main__ - Step 27: {'lr': 1.8571428571428572e-05, 'samples': 1296, 'steps': 26, 'loss/train': 9.158102989196777}
07/24/2024 16:25:11 - INFO - __main__ - Step 28: {'lr': 1.9285714285714285e-05, 'samples': 1344, 'steps': 27, 'loss/train': 9.14354419708252}
07/24/2024 16:25:11 - INFO - __main__ - Step 29: {'lr': 2e-05, 'samples': 1392, 'steps': 28, 'loss/train': 8.860624313354492}
07/24/2024 16:25:12 - INFO - __main__ - Step 30: {'lr': 2.0714285714285715e-05, 'samples': 1440, 'steps': 29, 'loss/train': 8.876450538635254}
07/24/2024 16:25:12 - INFO - __main__ - Step 31: {'lr': 2.1428571428571428e-05, 'samples': 1488, 'steps': 30, 'loss/train': 8.425738334655762}
07/24/2024 16:25:12 - INFO - __main__ - Step 32: {'lr': 2.214285714285714e-05, 'samples': 1536, 'steps': 31, 'loss/train': 8.942279815673828}
07/24/2024 16:25:13 - INFO - __main__ - Step 33: {'lr': 2.2857142857142858e-05, 'samples': 1584, 'steps': 32, 'loss/train': 8.757084846496582}
07/24/2024 16:25:13 - INFO - __main__ - Step 34: {'lr': 2.3571428571428575e-05, 'samples': 1632, 'steps': 33, 'loss/train': 8.699286460876465}
07/24/2024 16:25:13 - INFO - __main__ - Step 35: {'lr': 2.4285714285714285e-05, 'samples': 1680, 'steps': 34, 'loss/train': 8.857367515563965}
07/24/2024 16:25:13 - INFO - __main__ - Step 36: {'lr': 2.5e-05, 'samples': 1728, 'steps': 35, 'loss/train': 8.830195426940918}
07/24/2024 16:25:14 - INFO - __main__ - Step 37: {'lr': 2.5714285714285714e-05, 'samples': 1776, 'steps': 36, 'loss/train': 8.944982528686523}
07/24/2024 16:25:14 - INFO - __main__ - Step 38: {'lr': 2.642857142857143e-05, 'samples': 1824, 'steps': 37, 'loss/train': 8.670278549194336}
07/24/2024 16:25:14 - INFO - __main__ - Step 39: {'lr': 2.7142857142857144e-05, 'samples': 1872, 'steps': 38, 'loss/train': 8.710525512695312}
07/24/2024 16:25:14 - INFO - __main__ - Step 40: {'lr': 2.7857142857142858e-05, 'samples': 1920, 'steps': 39, 'loss/train': 7.902089595794678}
07/24/2024 16:25:15 - INFO - __main__ - Step 41: {'lr': 2.857142857142857e-05, 'samples': 1968, 'steps': 40, 'loss/train': 8.400484085083008}
07/24/2024 16:25:15 - INFO - __main__ - Step 42: {'lr': 2.9285714285714288e-05, 'samples': 2016, 'steps': 41, 'loss/train': 8.789310455322266}
07/24/2024 16:25:15 - INFO - __main__ - Step 43: {'lr': 3e-05, 'samples': 2064, 'steps': 42, 'loss/train': 8.754344940185547}
07/24/2024 16:25:16 - INFO - __main__ - Step 44: {'lr': 3.071428571428572e-05, 'samples': 2112, 'steps': 43, 'loss/train': 8.84192943572998}
07/24/2024 16:25:16 - INFO - __main__ - Step 45: {'lr': 3.142857142857143e-05, 'samples': 2160, 'steps': 44, 'loss/train': 8.784793853759766}
07/24/2024 16:25:16 - INFO - __main__ - Step 46: {'lr': 3.214285714285714e-05, 'samples': 2208, 'steps': 45, 'loss/train': 8.67403793334961}
07/24/2024 16:25:16 - INFO - __main__ - Step 47: {'lr': 3.285714285714286e-05, 'samples': 2256, 'steps': 46, 'loss/train': 8.51427173614502}
07/24/2024 16:25:17 - INFO - __main__ - Step 48: {'lr': 3.357142857142857e-05, 'samples': 2304, 'steps': 47, 'loss/train': 8.48193073272705}
07/24/2024 16:25:17 - INFO - __main__ - Step 49: {'lr': 3.428571428571429e-05, 'samples': 2352, 'steps': 48, 'loss/train': 8.518038749694824}
07/24/2024 16:25:17 - INFO - __main__ - Step 50: {'lr': 3.5000000000000004e-05, 'samples': 2400, 'steps': 49, 'loss/train': 8.63569450378418}
07/24/2024 16:25:17 - INFO - __main__ - Evaluating and saving model checkpoint
07/24/2024 16:25:18 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 1/1 shards.
07/24/2024 16:25:21 - INFO - __main__ - Step 50: {'loss/eval': 8.551246643066406, 'perplexity': 5173.19970703125}
07/24/2024 16:25:37 - WARNING - huggingface_hub.repository - fatal: could not read Username for 'https://huggingface.co': No such device or address
07/24/2024 16:42:21 - INFO - __main__ - Distributed environment: MULTI_GPU Backend: nccl
Num processes: 4
Process index: 0
Local process index: 0
Device: cuda:0
Mixed precision type: fp16
07/24/2024 16:42:21 - WARNING - huggingface_hub.repository - /dli/gptesla-small/./ is already a clone of https://huggingface.co/shng2025/gptesla-small. Make sure you pull the latest changes with `repo.git_pull()`.
07/24/2024 16:42:21 - WARNING - huggingface_hub.repository - Revision `cerulean-water-119` does not exist. Created and checked out branch `cerulean-water-119`.
07/24/2024 16:42:21 - WARNING - huggingface_hub.repository -
07/24/2024 16:42:23 - DEBUG - datasets.utils._dataset_viewer - Dataset info for shng2025/gptesla-train is not completely ready yet.
07/24/2024 16:42:23 - INFO - datasets.builder - No config specified, defaulting to the single config: gptesla-train/default
07/24/2024 16:42:23 - INFO - datasets.info - Loading Dataset Infos from /usr/local/lib/python3.10/dist-packages/datasets/packaged_modules/json
07/24/2024 16:42:28 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:28 - DEBUG - datasets.iterable_dataset - dataloader worker#1, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:28 - DEBUG - datasets.iterable_dataset - dataloader worker#2, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#4, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#3, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#6, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#7, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#5, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#8, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#11, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#10, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#9, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#12, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#14, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#13, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#15, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#21, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#16, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#19, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#22, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#20, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#17, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#18, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#23, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#25, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#24, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#26, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#27, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#29, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#30, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#31, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#32, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#28, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#33, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#34, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#35, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#36, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#37, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#38, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#39, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#40, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#41, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#42, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#43, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#44, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#45, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#46, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#48, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#47, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#49, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#50, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#51, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#52, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#53, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#55, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#54, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#56, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#57, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#58, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#59, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#60, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#61, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#62, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#63, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#64, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#65, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#66, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#67, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#68, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#69, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#70, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#72, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#71, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#73, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#75, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#76, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#74, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#77, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#78, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#79, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#81, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#82, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#80, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#83, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#84, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#85, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#86, ': Starting to iterate over 2/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#87, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#88, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#89, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#90, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#91, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#92, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#93, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#94, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.iterable_dataset - dataloader worker#95, ': Starting to iterate over 1/183 shards.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488651 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492277 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486023 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10562022 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:29 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488098 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485842 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486276 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486397 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487725 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492861 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486801 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489575 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10598254 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489599 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486616 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10501535 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10668116 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500930 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10489635 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525688 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499607 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10512203 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485918 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10751338 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487790 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485847 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10486172 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497062 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511500 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10498167 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497111 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10536479 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10499106 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497218 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10686322 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488385 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10530453 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10949076 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487097 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10610581 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488608 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10525926 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10485912 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495973 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10522596 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10621496 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511515 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10553677 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491327 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10487482 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10500290 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 11286262 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:30 - DEBUG - datasets.packaged_modules.json.json - Batch of 10863935 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10493913 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10497335 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10492554 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10495520 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491889 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10515063 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10509286 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10640425 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491547 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10676628 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10488150 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 11115863 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10552417 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10491272 bytes couldn't be parsed with block_size=327680. Retrying with block_size=655360.
07/24/2024 16:42:31 - DEBUG - datasets.packaged_modules.json.json - Batch of 10511604 bytes couldn't be parsed with block_size=655360. Retrying with block_size=1310720.
07/24/2024 16:42:45 - INFO - __main__ - Step 1: {'lr': 0.0, 'samples': 48, 'steps': 0, 'loss/train': 8.409734725952148}
07/24/2024 16:42:45 - INFO - __main__ - Step 2: {'lr': 7.142857142857143e-07, 'samples': 96, 'steps': 1, 'loss/train': 8.542716979980469}
07/24/2024 16:42:45 - INFO - __main__ - Step 3: {'lr': 1.4285714285714286e-06, 'samples': 144, 'steps': 2, 'loss/train': 8.60405158996582}
07/24/2024 16:42:46 - INFO - __main__ - Step 4: {'lr': 2.142857142857143e-06, 'samples': 192, 'steps': 3, 'loss/train': 8.401007652282715}
07/24/2024 16:42:46 - INFO - __main__ - Step 5: {'lr': 2.8571428571428573e-06, 'samples': 240, 'steps': 4, 'loss/train': 8.732222557067871}
07/24/2024 16:42:46 - INFO - __main__ - Step 6: {'lr': 3.5714285714285714e-06, 'samples': 288, 'steps': 5, 'loss/train': 8.438238143920898}
07/24/2024 16:42:46 - INFO - __main__ - Step 7: {'lr': 4.285714285714286e-06, 'samples': 336, 'steps': 6, 'loss/train': 8.689836502075195}
07/24/2024 16:42:47 - INFO - __main__ - Step 8: {'lr': 5e-06, 'samples': 384, 'steps': 7, 'loss/train': 8.583974838256836}
07/24/2024 16:42:47 - INFO - __main__ - Step 9: {'lr': 5.7142857142857145e-06, 'samples': 432, 'steps': 8, 'loss/train': 8.271807670593262}
07/24/2024 16:42:47 - INFO - __main__ - Step 10: {'lr': 6.428571428571429e-06, 'samples': 480, 'steps': 9, 'loss/train': 8.642550468444824}
07/24/2024 16:42:47 - INFO - __main__ - Step 11: {'lr': 7.142857142857143e-06, 'samples': 528, 'steps': 10, 'loss/train': 8.518206596374512}
07/24/2024 16:42:48 - INFO - __main__ - Step 12: {'lr': 7.857142857142858e-06, 'samples': 576, 'steps': 11, 'loss/train': 8.762425422668457}
07/24/2024 16:42:48 - INFO - __main__ - Step 13: {'lr': 8.571428571428573e-06, 'samples': 624, 'steps': 12, 'loss/train': 8.473711967468262}
07/24/2024 16:42:48 - INFO - __main__ - Step 14: {'lr': 9.285714285714286e-06, 'samples': 672, 'steps': 13, 'loss/train': 8.282694816589355}
07/24/2024 16:42:49 - INFO - __main__ - Step 15: {'lr': 1e-05, 'samples': 720, 'steps': 14, 'loss/train': 8.382986068725586}
07/24/2024 16:42:49 - INFO - __main__ - Step 16: {'lr': 1.0714285714285714e-05, 'samples': 768, 'steps': 15, 'loss/train': 7.3987932205200195}
07/24/2024 16:42:49 - INFO - __main__ - Step 17: {'lr': 1.1428571428571429e-05, 'samples': 816, 'steps': 16, 'loss/train': 8.041902542114258}
07/24/2024 16:42:49 - INFO - __main__ - Step 18: {'lr': 1.2142857142857142e-05, 'samples': 864, 'steps': 17, 'loss/train': 8.312195777893066}
07/24/2024 16:42:50 - INFO - __main__ - Step 19: {'lr': 1.2857142857142857e-05, 'samples': 912, 'steps': 18, 'loss/train': 8.101341247558594}
07/24/2024 16:42:50 - INFO - __main__ - Step 20: {'lr': 1.3571428571428572e-05, 'samples': 960, 'steps': 19, 'loss/train': 8.539198875427246}
07/24/2024 16:42:50 - INFO - __main__ - Step 21: {'lr': 1.4285714285714285e-05, 'samples': 1008, 'steps': 20, 'loss/train': 7.702454090118408}
07/24/2024 16:42:51 - INFO - __main__ - Step 22: {'lr': 1.5e-05, 'samples': 1056, 'steps': 21, 'loss/train': 8.377392768859863}
07/24/2024 16:42:51 - INFO - __main__ - Step 23: {'lr': 1.5714285714285715e-05, 'samples': 1104, 'steps': 22, 'loss/train': 7.6457695960998535}
07/24/2024 16:42:51 - INFO - __main__ - Step 24: {'lr': 1.642857142857143e-05, 'samples': 1152, 'steps': 23, 'loss/train': 7.307077407836914}
07/24/2024 16:42:51 - INFO - __main__ - Step 25: {'lr': 1.7142857142857145e-05, 'samples': 1200, 'steps': 24, 'loss/train': 8.0068941116333}
07/24/2024 16:42:52 - INFO - __main__ - Step 26: {'lr': 1.7857142857142855e-05, 'samples': 1248, 'steps': 25, 'loss/train': 7.943492412567139}
07/24/2024 16:42:52 - INFO - __main__ - Step 27: {'lr': 1.8571428571428572e-05, 'samples': 1296, 'steps': 26, 'loss/train': 8.152873992919922}
07/24/2024 16:42:52 - INFO - __main__ - Step 28: {'lr': 1.9285714285714285e-05, 'samples': 1344, 'steps': 27, 'loss/train': 8.336533546447754}
07/24/2024 16:42:52 - INFO - __main__ - Step 29: {'lr': 2e-05, 'samples': 1392, 'steps': 28, 'loss/train': 8.025687217712402}
07/24/2024 16:42:53 - INFO - __main__ - Step 30: {'lr': 2.0714285714285715e-05, 'samples': 1440, 'steps': 29, 'loss/train': 7.878050327301025}
07/24/2024 16:42:53 - INFO - __main__ - Step 31: {'lr': 2.1428571428571428e-05, 'samples': 1488, 'steps': 30, 'loss/train': 7.359142780303955}
07/24/2024 16:42:53 - INFO - __main__ - Step 32: {'lr': 2.214285714285714e-05, 'samples': 1536, 'steps': 31, 'loss/train': 8.03093433380127}
07/24/2024 16:42:54 - INFO - __main__ - Step 33: {'lr': 2.2857142857142858e-05, 'samples': 1584, 'steps': 32, 'loss/train': 7.825865745544434}
07/24/2024 16:42:54 - INFO - __main__ - Step 34: {'lr': 2.3571428571428575e-05, 'samples': 1632, 'steps': 33, 'loss/train': 7.730936050415039}
07/24/2024 16:42:54 - INFO - __main__ - Step 35: {'lr': 2.4285714285714285e-05, 'samples': 1680, 'steps': 34, 'loss/train': 7.843356132507324}
07/24/2024 16:42:54 - INFO - __main__ - Step 36: {'lr': 2.5e-05, 'samples': 1728, 'steps': 35, 'loss/train': 7.827062606811523}
07/24/2024 16:42:55 - INFO - __main__ - Step 37: {'lr': 2.5714285714285714e-05, 'samples': 1776, 'steps': 36, 'loss/train': 7.771824359893799}
07/24/2024 16:42:55 - INFO - __main__ - Step 38: {'lr': 2.642857142857143e-05, 'samples': 1824, 'steps': 37, 'loss/train': 7.620804786682129}
07/24/2024 16:42:55 - INFO - __main__ - Step 39: {'lr': 2.7142857142857144e-05, 'samples': 1872, 'steps': 38, 'loss/train': 7.73307991027832}
07/24/2024 16:42:56 - INFO - __main__ - Step 40: {'lr': 2.7857142857142858e-05, 'samples': 1920, 'steps': 39, 'loss/train': 6.725803852081299}
07/24/2024 16:42:56 - INFO - __main__ - Step 41: {'lr': 2.857142857142857e-05, 'samples': 1968, 'steps': 40, 'loss/train': 7.44889497756958}
07/24/2024 16:42:56 - INFO - __main__ - Step 42: {'lr': 2.9285714285714288e-05, 'samples': 2016, 'steps': 41, 'loss/train': 7.854197025299072}
07/24/2024 16:42:56 - INFO - __main__ - Step 43: {'lr': 3e-05, 'samples': 2064, 'steps': 42, 'loss/train': 7.926906585693359}
07/24/2024 16:42:57 - INFO - __main__ - Step 44: {'lr': 3.071428571428572e-05, 'samples': 2112, 'steps': 43, 'loss/train': 7.857071876525879}
07/24/2024 16:42:57 - INFO - __main__ - Step 45: {'lr': 3.142857142857143e-05, 'samples': 2160, 'steps': 44, 'loss/train': 7.80991792678833}
07/24/2024 16:42:57 - INFO - __main__ - Step 46: {'lr': 3.214285714285714e-05, 'samples': 2208, 'steps': 45, 'loss/train': 7.429677963256836}
07/24/2024 16:42:57 - INFO - __main__ - Step 47: {'lr': 3.285714285714286e-05, 'samples': 2256, 'steps': 46, 'loss/train': 7.601736068725586}
07/24/2024 16:42:58 - INFO - __main__ - Step 48: {'lr': 3.357142857142857e-05, 'samples': 2304, 'steps': 47, 'loss/train': 7.540863037109375}
07/24/2024 16:42:58 - INFO - __main__ - Step 49: {'lr': 3.428571428571429e-05, 'samples': 2352, 'steps': 48, 'loss/train': 7.452314376831055}
07/24/2024 16:42:58 - INFO - __main__ - Step 50: {'lr': 3.5000000000000004e-05, 'samples': 2400, 'steps': 49, 'loss/train': 7.728540897369385}
07/24/2024 16:42:58 - INFO - __main__ - Evaluating and saving model checkpoint
07/24/2024 16:42:59 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 1/1 shards.
07/24/2024 16:43:02 - INFO - __main__ - Step 50: {'loss/eval': 7.611824989318848, 'perplexity': 2021.9647216796875}
07/24/2024 16:43:18 - WARNING - huggingface_hub.repository - Several commits (2) will be pushed upstream.
07/24/2024 16:43:18 - WARNING - huggingface_hub.repository - The progress bars may be unreliable.
07/24/2024 16:43:32 - WARNING - huggingface_hub.repository - To https://huggingface.co/shng2025/gptesla-small
4d63b0c..caa5803 cerulean-water-119 -> cerulean-water-119
07/24/2024 16:43:33 - INFO - __main__ - Step 51: {'lr': 3.571428571428571e-05, 'samples': 2448, 'steps': 50, 'loss/train': 6.487676620483398}
07/24/2024 16:43:33 - INFO - __main__ - Step 52: {'lr': 3.642857142857143e-05, 'samples': 2496, 'steps': 51, 'loss/train': 7.67896032333374}
07/24/2024 16:43:34 - INFO - __main__ - Step 53: {'lr': 3.7142857142857143e-05, 'samples': 2544, 'steps': 52, 'loss/train': 7.545412540435791}
07/24/2024 16:43:34 - INFO - __main__ - Step 54: {'lr': 3.7857142857142864e-05, 'samples': 2592, 'steps': 53, 'loss/train': 7.630044937133789}
07/24/2024 16:43:34 - INFO - __main__ - Step 55: {'lr': 3.857142857142857e-05, 'samples': 2640, 'steps': 54, 'loss/train': 7.491697311401367}
07/24/2024 16:43:34 - INFO - __main__ - Step 56: {'lr': 3.928571428571428e-05, 'samples': 2688, 'steps': 55, 'loss/train': 7.302150249481201}
07/24/2024 16:43:35 - INFO - __main__ - Step 57: {'lr': 4e-05, 'samples': 2736, 'steps': 56, 'loss/train': 7.448665142059326}
07/24/2024 16:43:35 - INFO - __main__ - Step 58: {'lr': 4.0714285714285717e-05, 'samples': 2784, 'steps': 57, 'loss/train': 6.885580062866211}
07/24/2024 16:43:35 - INFO - __main__ - Step 59: {'lr': 4.142857142857143e-05, 'samples': 2832, 'steps': 58, 'loss/train': 7.71057653427124}
07/24/2024 16:43:36 - INFO - __main__ - Step 60: {'lr': 4.214285714285714e-05, 'samples': 2880, 'steps': 59, 'loss/train': 7.289674282073975}
07/24/2024 16:43:36 - INFO - __main__ - Step 61: {'lr': 4.2857142857142856e-05, 'samples': 2928, 'steps': 60, 'loss/train': 7.388137340545654}
07/24/2024 16:43:36 - INFO - __main__ - Step 62: {'lr': 4.3571428571428576e-05, 'samples': 2976, 'steps': 61, 'loss/train': 7.354274749755859}
07/24/2024 16:43:36 - INFO - __main__ - Step 63: {'lr': 4.428571428571428e-05, 'samples': 3024, 'steps': 62, 'loss/train': 7.121933937072754}
07/24/2024 16:43:37 - INFO - __main__ - Step 64: {'lr': 4.4999999999999996e-05, 'samples': 3072, 'steps': 63, 'loss/train': 6.061006546020508}
07/24/2024 16:43:37 - INFO - __main__ - Step 65: {'lr': 4.5714285714285716e-05, 'samples': 3120, 'steps': 64, 'loss/train': 7.104621887207031}
07/24/2024 16:43:37 - INFO - __main__ - Step 66: {'lr': 4.642857142857143e-05, 'samples': 3168, 'steps': 65, 'loss/train': 6.724586486816406}
07/24/2024 16:43:38 - INFO - __main__ - Step 67: {'lr': 4.714285714285715e-05, 'samples': 3216, 'steps': 66, 'loss/train': 6.689899921417236}
07/24/2024 16:43:38 - INFO - __main__ - Step 68: {'lr': 4.7857142857142856e-05, 'samples': 3264, 'steps': 67, 'loss/train': 7.3340067863464355}
07/24/2024 16:43:38 - INFO - __main__ - Step 69: {'lr': 4.857142857142857e-05, 'samples': 3312, 'steps': 68, 'loss/train': 7.200557231903076}
07/24/2024 16:43:38 - INFO - __main__ - Step 70: {'lr': 4.928571428571429e-05, 'samples': 3360, 'steps': 69, 'loss/train': 6.917683124542236}
07/24/2024 16:43:39 - INFO - __main__ - Step 71: {'lr': 5e-05, 'samples': 3408, 'steps': 70, 'loss/train': 7.259862899780273}
07/24/2024 16:43:39 - INFO - __main__ - Step 72: {'lr': 5.0714285714285716e-05, 'samples': 3456, 'steps': 71, 'loss/train': 6.847894191741943}
07/24/2024 16:43:39 - INFO - __main__ - Step 73: {'lr': 5.142857142857143e-05, 'samples': 3504, 'steps': 72, 'loss/train': 7.104192733764648}
07/24/2024 16:43:40 - INFO - __main__ - Step 74: {'lr': 5.214285714285714e-05, 'samples': 3552, 'steps': 73, 'loss/train': 6.7482452392578125}
07/24/2024 16:43:40 - INFO - __main__ - Step 75: {'lr': 5.285714285714286e-05, 'samples': 3600, 'steps': 74, 'loss/train': 6.932758808135986}
07/24/2024 16:43:40 - INFO - __main__ - Step 76: {'lr': 5.357142857142857e-05, 'samples': 3648, 'steps': 75, 'loss/train': 7.208134651184082}
07/24/2024 16:43:40 - INFO - __main__ - Step 77: {'lr': 5.428571428571429e-05, 'samples': 3696, 'steps': 76, 'loss/train': 7.33575439453125}
07/24/2024 16:43:41 - INFO - __main__ - Step 78: {'lr': 5.5e-05, 'samples': 3744, 'steps': 77, 'loss/train': 6.378943920135498}
07/24/2024 16:43:41 - INFO - __main__ - Step 79: {'lr': 5.5714285714285715e-05, 'samples': 3792, 'steps': 78, 'loss/train': 7.227607250213623}
07/24/2024 16:43:41 - INFO - __main__ - Step 80: {'lr': 5.642857142857143e-05, 'samples': 3840, 'steps': 79, 'loss/train': 6.442720890045166}
07/24/2024 16:43:41 - INFO - __main__ - Step 81: {'lr': 5.714285714285714e-05, 'samples': 3888, 'steps': 80, 'loss/train': 7.17632532119751}
07/24/2024 16:43:42 - INFO - __main__ - Step 82: {'lr': 5.7857142857142855e-05, 'samples': 3936, 'steps': 81, 'loss/train': 6.7079668045043945}
07/24/2024 16:43:42 - INFO - __main__ - Step 83: {'lr': 5.8571428571428575e-05, 'samples': 3984, 'steps': 82, 'loss/train': 6.951054096221924}
07/24/2024 16:43:42 - INFO - __main__ - Step 84: {'lr': 5.928571428571429e-05, 'samples': 4032, 'steps': 83, 'loss/train': 7.006659507751465}
07/24/2024 16:43:43 - INFO - __main__ - Step 85: {'lr': 6e-05, 'samples': 4080, 'steps': 84, 'loss/train': 6.719784736633301}
07/24/2024 16:43:43 - INFO - __main__ - Step 86: {'lr': 6.0714285714285715e-05, 'samples': 4128, 'steps': 85, 'loss/train': 6.775872230529785}
07/24/2024 16:43:43 - INFO - __main__ - Step 87: {'lr': 6.142857142857143e-05, 'samples': 4176, 'steps': 86, 'loss/train': 6.850857257843018}
07/24/2024 16:43:43 - INFO - __main__ - Step 88: {'lr': 6.214285714285714e-05, 'samples': 4224, 'steps': 87, 'loss/train': 5.614303112030029}
07/24/2024 16:43:44 - INFO - __main__ - Step 89: {'lr': 6.285714285714286e-05, 'samples': 4272, 'steps': 88, 'loss/train': 6.760764122009277}
07/24/2024 16:43:44 - INFO - __main__ - Step 90: {'lr': 6.357142857142857e-05, 'samples': 4320, 'steps': 89, 'loss/train': 6.488976955413818}
07/24/2024 16:43:44 - INFO - __main__ - Step 91: {'lr': 6.428571428571427e-05, 'samples': 4368, 'steps': 90, 'loss/train': 6.214510917663574}
07/24/2024 16:43:45 - INFO - __main__ - Step 92: {'lr': 6.500000000000001e-05, 'samples': 4416, 'steps': 91, 'loss/train': 7.034832954406738}
07/24/2024 16:43:45 - INFO - __main__ - Step 93: {'lr': 6.571428571428571e-05, 'samples': 4464, 'steps': 92, 'loss/train': 6.2593488693237305}
07/24/2024 16:43:45 - INFO - __main__ - Step 94: {'lr': 6.642857142857143e-05, 'samples': 4512, 'steps': 93, 'loss/train': 7.205167293548584}
07/24/2024 16:43:45 - INFO - __main__ - Step 95: {'lr': 6.714285714285714e-05, 'samples': 4560, 'steps': 94, 'loss/train': 6.675778865814209}
07/24/2024 16:43:46 - INFO - __main__ - Step 96: {'lr': 6.785714285714285e-05, 'samples': 4608, 'steps': 95, 'loss/train': 4.166206359863281}
07/24/2024 16:43:46 - INFO - __main__ - Step 97: {'lr': 6.857142857142858e-05, 'samples': 4656, 'steps': 96, 'loss/train': 6.848745346069336}
07/24/2024 16:43:46 - INFO - __main__ - Step 98: {'lr': 6.928571428571429e-05, 'samples': 4704, 'steps': 97, 'loss/train': 6.357327461242676}
07/24/2024 16:43:47 - INFO - __main__ - Step 99: {'lr': 7.000000000000001e-05, 'samples': 4752, 'steps': 98, 'loss/train': 6.601438999176025}
07/24/2024 16:43:47 - INFO - __main__ - Step 100: {'lr': 7.071428571428571e-05, 'samples': 4800, 'steps': 99, 'loss/train': 6.914941310882568}
07/24/2024 16:43:47 - INFO - __main__ - Evaluating and saving model checkpoint
07/24/2024 16:43:47 - DEBUG - datasets.iterable_dataset - dataloader worker#0, ': Starting to iterate over 1/1 shards.
07/24/2024 16:43:51 - INFO - __main__ - Step 100: {'loss/eval': 6.708734035491943, 'perplexity': 819.532470703125}