[2024-08-05 05:43:40,662][00034] Saving configuration to /kaggle/working/train_dir/default_experiment/config.json...
[2024-08-05 05:43:40,664][00034] Rollout worker 0 uses device cpu
[2024-08-05 05:43:40,832][00034] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-08-05 05:43:40,834][00034] InferenceWorker_p0-w0: min num requests: 1
[2024-08-05 05:43:40,840][00034] Starting all processes...
[2024-08-05 05:43:40,841][00034] Starting process learner_proc0
[2024-08-05 05:43:40,940][00034] Starting all processes...
[2024-08-05 05:43:40,947][00034] Starting process inference_proc0-0
[2024-08-05 05:43:40,948][00034] Starting process rollout_proc0
[2024-08-05 05:43:43,852][00138] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-08-05 05:43:43,853][00138] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2024-08-05 05:43:43,870][00138] Num visible devices: 1
[2024-08-05 05:43:44,132][00132] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-08-05 05:43:44,132][00132] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2024-08-05 05:43:44,149][00132] Num visible devices: 1
[2024-08-05 05:43:44,168][00132] Setting fixed seed 0
[2024-08-05 05:43:44,171][00132] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-08-05 05:43:44,172][00132] Initializing actor-critic model on device cuda:0
[2024-08-05 05:43:44,172][00132] RunningMeanStd input shape: (23,)
[2024-08-05 05:43:44,175][00132] RunningMeanStd input shape: (3, 72, 128)
[2024-08-05 05:43:44,175][00132] RunningMeanStd input shape: (1,)
[2024-08-05 05:43:44,186][00139] Worker 0 uses CPU cores [0, 1, 2, 3]
[2024-08-05 05:43:44,192][00132] ConvEncoder: input_channels=3
[2024-08-05 05:43:44,430][00132] Conv encoder output size: 512
[2024-08-05 05:43:44,431][00132] Policy head output size: 640
[2024-08-05 05:43:44,511][00132] Created Actor Critic model with architecture:
[2024-08-05 05:43:44,511][00132] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (measurements): RunningMeanStdInPlace()
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ReLU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ReLU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ReLU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ReLU)
        )
      )
    )
    (measurements_head): Sequential(
      (0): Linear(in_features=23, out_features=128, bias=True)
      (1): ReLU()
      (2): Linear(in_features=128, out_features=128, bias=True)
      (3): ReLU()
    )
  )
  (core): ModelCoreRNN(
    (core): LSTM(640, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=39, bias=True)
  )
)
[2024-08-05 05:43:44,765][00132] Using optimizer
[2024-08-05 05:43:45,821][00132] No checkpoints found
[2024-08-05 05:43:45,821][00132] Did not load from checkpoint, starting from scratch!
[2024-08-05 05:43:45,822][00132] Initialized policy 0 weights for model version 0
[2024-08-05 05:43:45,824][00132] LearnerWorker_p0 finished initialization!
[2024-08-05 05:43:45,825][00132] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2024-08-05 05:43:45,911][00138] RunningMeanStd input shape: (23,)
[2024-08-05 05:43:45,912][00138] RunningMeanStd input shape: (3, 72, 128)
[2024-08-05 05:43:45,912][00138] RunningMeanStd input shape: (1,)
[2024-08-05 05:43:45,928][00138] ConvEncoder: input_channels=3
[2024-08-05 05:43:46,047][00138] Conv encoder output size: 512
[2024-08-05 05:43:46,048][00138] Policy head output size: 640
[2024-08-05 05:43:46,121][00034] Inference worker 0-0 is ready!
[2024-08-05 05:43:46,123][00034] All inference workers are ready! Signal rollout workers to start!
[2024-08-05 05:43:46,158][00139] Doom resolution: 160x120, resize resolution: (128, 72)
[2024-08-05 05:43:46,160][00139] Port 40300 is available
[2024-08-05 05:43:46,160][00139] Using port 40300
[2024-08-05 05:43:46,164][00139] Using port 40300 on host...
[2024-08-05 05:43:46,531][00139] Initialized w:0 v:0 player:0
[2024-08-05 05:43:46,540][00139] Decorrelating experience for 0 frames...
[2024-08-05 05:43:46,570][00139] Port 40301 is available
[2024-08-05 05:43:46,570][00139] Using port 40301
[2024-08-05 05:43:46,571][00139] Using port 40301 on host...
[2024-08-05 05:43:46,880][00139] Initialized w:0 v:1 player:0
[2024-08-05 05:43:46,881][00139] Decorrelating experience for 32 frames...
[2024-08-05 05:43:50,483][00034] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 381. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-08-05 05:43:55,483][00034] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 338.4. Samples: 2073. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2024-08-05 05:44:00,483][00034] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 217.5. Samples: 2556.
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-08-05 05:44:00,824][00034] Heartbeat connected on Batcher_0 [2024-08-05 05:44:00,828][00034] Heartbeat connected on LearnerWorker_p0 [2024-08-05 05:44:00,839][00034] Heartbeat connected on InferenceWorker_p0-w0 [2024-08-05 05:44:00,841][00034] Heartbeat connected on RolloutWorker_w0 [2024-08-05 05:44:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 546.1, 300 sec: 546.1). Total num frames: 8192. Throughput: 0: 218.0. Samples: 3651. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-08-05 05:44:05,861][00139] DAMAGECOUNT value on done: 15.0 [2024-08-05 05:44:06,112][00139] DAMAGECOUNT value on done: 17.0 [2024-08-05 05:44:06,113][00139] Sum rewards: -9.493, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-3.000', 'HEALTH': '-1.004', 'weapon4': '0.004', 'AMMO2': '0.006', 'AMMO5': '0.009', 'HITCOUNT': '0.020', 'WEAPON1': '0.020', 'AMMO4': '0.032', 'ARMOR': '0.043', 'DAMAGECOUNT': '0.051', 'weapon5': '0.068', 'WEAPON4': '0.100', 'AMMO3': '0.124', 'WEAPON5': '0.200', 'weapon3': '0.534', 'WEAPON3': '0.550', 'weapon2': '1.000'} [2024-08-05 05:44:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 16384. Throughput: 0: 247.3. Samples: 5328. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) [2024-08-05 05:44:10,486][00034] Avg episode reward: [(0, '-8.071')] [2024-08-05 05:44:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 983.0, 300 sec: 983.0). Total num frames: 24576. Throughput: 0: 232.7. Samples: 6198. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0) [2024-08-05 05:44:15,485][00034] Avg episode reward: [(0, '-8.071')] [2024-08-05 05:44:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 32768. Throughput: 0: 250.7. Samples: 7902. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:20,484][00034] Avg episode reward: [(0, '-8.071')] [2024-08-05 05:44:20,673][00139] DAMAGECOUNT value on done: 347.0 [2024-08-05 05:44:20,673][00139] Sum rewards: -4.333, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.296', 'AMMO2': '0.012', 'AMMO4': '0.057', 'ARMOR': '0.068', 'WEAPON4': '0.150', 'AMMO3': '0.176', 'weapon4': '0.192', 'HITCOUNT': '0.200', 'weapon3': '0.650', 'weapon2': '0.762', 'WEAPON3': '0.950', 'DAMAGECOUNT': '0.996', 'FRAGCOUNT': '2.000'} [2024-08-05 05:44:20,896][00139] DAMAGECOUNT value on done: 17.0 [2024-08-05 05:44:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 936.2, 300 sec: 936.2). Total num frames: 32768. Throughput: 0: 264.3. Samples: 9630. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:25,485][00034] Avg episode reward: [(0, '-6.886')] [2024-08-05 05:44:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1024.0, 300 sec: 1024.0). Total num frames: 40960. Throughput: 0: 253.0. Samples: 10503. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:30,484][00034] Avg episode reward: [(0, '-6.886')] [2024-08-05 05:44:35,149][00139] DAMAGECOUNT value on done: 347.0 [2024-08-05 05:44:35,386][00139] DAMAGECOUNT value on done: 17.0 [2024-08-05 05:44:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 49152. Throughput: 0: 263.9. Samples: 12255. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:35,484][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:44:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 983.0, 300 sec: 983.0). Total num frames: 49152. Throughput: 0: 264.9. Samples: 13993. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:40,484][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:44:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1042.6, 300 sec: 1042.6). Total num frames: 57344. Throughput: 0: 273.3. Samples: 14853. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:45,485][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:44:49,783][00139] DAMAGECOUNT value on done: 447.0 [2024-08-05 05:44:50,005][00139] DAMAGECOUNT value on done: 157.0 [2024-08-05 05:44:50,006][00139] Sum rewards: -8.261, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.292', 'AMMO2': '0.015', 'ARMOR': '0.056', 'AMMO4': '0.073', 'weapon4': '0.094', 'HITCOUNT': '0.120', 'AMMO3': '0.147', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.420', 'weapon3': '0.554', 'WEAPON3': '0.750', 'weapon2': '0.902', 'FRAGCOUNT': '1.000'} [2024-08-05 05:44:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 65536. Throughput: 0: 286.4. Samples: 16537. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:50,485][00034] Avg episode reward: [(0, '-7.276')] [2024-08-05 05:44:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1134.3). Total num frames: 73728. Throughput: 0: 289.7. Samples: 18364. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:44:55,484][00034] Avg episode reward: [(0, '-7.276')] [2024-08-05 05:45:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1053.3). Total num frames: 73728. Throughput: 0: 288.9. Samples: 19197. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:00,484][00034] Avg episode reward: [(0, '-7.276')] [2024-08-05 05:45:02,576][00138] Updated weights for policy 0, policy_version 10 (0.0016) [2024-08-05 05:45:04,143][00139] DAMAGECOUNT value on done: 617.0 [2024-08-05 05:45:04,144][00139] Sum rewards: -4.048, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.184', 'AMMO2': '0.000', 'AMMO4': '0.002', 'ARMOR': '0.081', 'HITCOUNT': '0.100', 'AMMO3': '0.139', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.750', 'weapon3': '0.762', 'weapon2': '1.042', 'FRAGCOUNT': '2.000'} [2024-08-05 05:45:04,374][00139] DAMAGECOUNT value on done: 217.0 [2024-08-05 05:45:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1092.3). Total num frames: 81920. Throughput: 0: 289.6. Samples: 20933. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:05,484][00034] Avg episode reward: [(0, '-7.334')] [2024-08-05 05:45:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1126.4). Total num frames: 90112. Throughput: 0: 289.4. Samples: 22652. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:10,485][00034] Avg episode reward: [(0, '-7.334')] [2024-08-05 05:45:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1060.1). Total num frames: 90112. Throughput: 0: 289.4. Samples: 23527. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:15,485][00034] Avg episode reward: [(0, '-7.334')] [2024-08-05 05:45:18,865][00139] DAMAGECOUNT value on done: 639.0 [2024-08-05 05:45:19,099][00139] DAMAGECOUNT value on done: 284.0 [2024-08-05 05:45:19,100][00139] Sum rewards: -5.957, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.072', 'AMMO2': '0.014', 'AMMO4': '0.067', 'HITCOUNT': '0.070', 'ARMOR': '0.088', 'AMMO3': '0.119', 'weapon4': '0.128', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.201', 'WEAPON3': '0.650', 'weapon3': '0.786', 'weapon2': '0.842', 'FRAGCOUNT': '1.000'} [2024-08-05 05:45:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 98304. Throughput: 0: 287.8. Samples: 25205. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:20,486][00034] Avg episode reward: [(0, '-7.045')] [2024-08-05 05:45:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1121.0). Total num frames: 106496. Throughput: 0: 286.6. Samples: 26890. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:25,484][00034] Avg episode reward: [(0, '-7.045')] [2024-08-05 05:45:25,487][00132] Saving new best policy, reward=-7.045! [2024-08-05 05:45:28,905][00139] Large shaping reward -2.549 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.3, -100.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 05:45:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1065.0). Total num frames: 106496. Throughput: 0: 287.6. Samples: 27797. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:30,484][00034] Avg episode reward: [(0, '-7.045')] [2024-08-05 05:45:33,408][00139] DAMAGECOUNT value on done: 659.0 [2024-08-05 05:45:33,408][00139] Sum rewards: -14.386, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-2.148', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.011', 'AMMO2': '0.018', 'HITCOUNT': '0.020', 'ARMOR': '0.040', 'DAMAGECOUNT': '0.060', 'AMMO4': '0.087', 'AMMO3': '0.104', 'weapon5': '0.104', 'weapon4': '0.124', 'WEAPON4': '0.150', 'weapon3': '0.246', 'WEAPON5': '0.250', 'WEAPON3': '0.550', 'weapon2': '0.998'} [2024-08-05 05:45:33,639][00139] DAMAGECOUNT value on done: 344.0 [2024-08-05 05:45:33,640][00139] Sum rewards: -8.261, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.052', 'AMMO2': '0.013', 'HITCOUNT': '0.050', 'AMMO4': '0.067', 'WEAPON4': '0.100', 'AMMO3': '0.104', 'DAMAGECOUNT': '0.180', 'weapon3': '0.456', 'ARMOR': '0.471', 'WEAPON3': '0.500', 'FRAGCOUNT': '1.000', 'weapon2': '1.100'} [2024-08-05 05:45:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 114688. Throughput: 0: 289.0. Samples: 29540. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:35,485][00034] Avg episode reward: [(0, '-7.657')] [2024-08-05 05:45:35,489][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000014_114688.pth... [2024-08-05 05:45:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1117.1). Total num frames: 122880. Throughput: 0: 287.0. Samples: 31281. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:40,484][00034] Avg episode reward: [(0, '-7.657')] [2024-08-05 05:45:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1139.8). Total num frames: 131072. Throughput: 0: 288.4. Samples: 32173. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:45,484][00034] Avg episode reward: [(0, '-7.657')] [2024-08-05 05:45:47,574][00139] DAMAGECOUNT value on done: 664.0 [2024-08-05 05:45:47,780][00139] DAMAGECOUNT value on done: 414.0 [2024-08-05 05:45:47,780][00139] Sum rewards: -3.914, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.061', 'AMMO5': '0.005', 'AMMO2': '0.006', 'AMMO4': '0.029', 'ARMOR': '0.040', 'weapon4': '0.068', 'HITCOUNT': '0.070', 'WEAPON4': '0.100', 'AMMO3': '0.115', 'DAMAGECOUNT': '0.210', 'WEAPON3': '0.550', 'weapon3': '0.798', 'weapon2': '0.906', 'FRAGCOUNT': '1.000'} [2024-08-05 05:45:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 131072. Throughput: 0: 289.3. Samples: 33951. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:50,484][00034] Avg episode reward: [(0, '-7.259')] [2024-08-05 05:45:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1114.1). Total num frames: 139264. Throughput: 0: 289.8. Samples: 35695. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:45:55,484][00034] Avg episode reward: [(0, '-7.259')] [2024-08-05 05:46:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1134.3). Total num frames: 147456. Throughput: 0: 291.0. Samples: 36621. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:00,485][00034] Avg episode reward: [(0, '-7.259')] [2024-08-05 05:46:02,027][00139] DAMAGECOUNT value on done: 664.0 [2024-08-05 05:46:02,261][00139] DAMAGECOUNT value on done: 484.0 [2024-08-05 05:46:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1092.3). Total num frames: 147456. Throughput: 0: 292.1. Samples: 38348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:05,484][00034] Avg episode reward: [(0, '-7.328')] [2024-08-05 05:46:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1111.8). Total num frames: 155648. Throughput: 0: 293.8. Samples: 40111. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:10,484][00034] Avg episode reward: [(0, '-7.328')] [2024-08-05 05:46:13,163][00138] Updated weights for policy 0, policy_version 20 (0.0018) [2024-08-05 05:46:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1129.9). Total num frames: 163840. Throughput: 0: 292.7. Samples: 40970. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:15,485][00034] Avg episode reward: [(0, '-7.328')] [2024-08-05 05:46:16,498][00139] DAMAGECOUNT value on done: 672.0 [2024-08-05 05:46:16,736][00139] DAMAGECOUNT value on done: 604.0 [2024-08-05 05:46:16,736][00139] Sum rewards: -8.416, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.732', 'AMMO5': '0.005', 'AMMO2': '0.008', 'HITCOUNT': '0.030', 'weapon5': '0.036', 'AMMO4': '0.039', 'WEAPON5': '0.100', 'AMMO3': '0.116', 'DAMAGECOUNT': '0.360', 'WEAPON3': '0.600', 'weapon3': '0.644', 'weapon2': '0.878'} [2024-08-05 05:46:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1146.9). Total num frames: 172032. Throughput: 0: 292.2. Samples: 42689. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:20,485][00034] Avg episode reward: [(0, '-7.366')] [2024-08-05 05:46:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1109.9). Total num frames: 172032. Throughput: 0: 291.2. Samples: 44387. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:25,484][00034] Avg episode reward: [(0, '-7.366')] [2024-08-05 05:46:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1126.4). Total num frames: 180224. Throughput: 0: 290.9. Samples: 45262. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:30,484][00034] Avg episode reward: [(0, '-7.366')] [2024-08-05 05:46:31,166][00139] DAMAGECOUNT value on done: 716.0 [2024-08-05 05:46:31,392][00139] DAMAGECOUNT value on done: 609.0 [2024-08-05 05:46:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1141.9). 
Total num frames: 188416. Throughput: 0: 289.8. Samples: 46992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:35,484][00034] Avg episode reward: [(0, '-7.409')] [2024-08-05 05:46:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1108.3). Total num frames: 188416. Throughput: 0: 289.3. Samples: 48714. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:40,484][00034] Avg episode reward: [(0, '-7.409')] [2024-08-05 05:46:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1123.5). Total num frames: 196608. Throughput: 0: 287.5. Samples: 49560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:45,485][00034] Avg episode reward: [(0, '-7.409')] [2024-08-05 05:46:45,727][00139] DAMAGECOUNT value on done: 736.0 [2024-08-05 05:46:45,728][00139] Sum rewards: -9.448, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.665', 'AMMO2': '0.011', 'HITCOUNT': '0.030', 'ARMOR': '0.056', 'AMMO4': '0.056', 'DAMAGECOUNT': '0.060', 'weapon4': '0.108', 'AMMO3': '0.141', 'WEAPON4': '0.150', 'weapon3': '0.494', 'WEAPON3': '0.700', 'weapon2': '0.910', 'FRAGCOUNT': '1.000'} [2024-08-05 05:46:45,946][00139] DAMAGECOUNT value on done: 669.0 [2024-08-05 05:46:45,946][00139] Sum rewards: -8.693, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.560', 'FRAGCOUNT': '-1.500', 'AMMO4': '-0.023', 'AMMO2': '-0.005', 'AMMO5': '0.005', 'weapon5': '0.012', 'ARMOR': '0.040', 'HITCOUNT': '0.050', 'WEAPON5': '0.100', 'AMMO3': '0.116', 'DAMAGECOUNT': '0.180', 'WEAPON3': '0.600', 'weapon2': '0.858', 'weapon3': '0.934'} [2024-08-05 05:46:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1137.8). Total num frames: 204800. Throughput: 0: 288.2. Samples: 51315. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:50,484][00034] Avg episode reward: [(0, '-7.548')] [2024-08-05 05:46:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1107.0). Total num frames: 204800. Throughput: 0: 287.6. 
Samples: 53051. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:46:55,484][00034] Avg episode reward: [(0, '-7.548')] [2024-08-05 05:47:00,212][00139] DAMAGECOUNT value on done: 766.0 [2024-08-05 05:47:00,417][00139] DAMAGECOUNT value on done: 718.0 [2024-08-05 05:47:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1121.0). Total num frames: 212992. Throughput: 0: 287.8. Samples: 53922. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:00,484][00034] Avg episode reward: [(0, '-7.592')] [2024-08-05 05:47:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1134.3). Total num frames: 221184. Throughput: 0: 288.7. Samples: 55681. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:05,484][00034] Avg episode reward: [(0, '-7.592')] [2024-08-05 05:47:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1146.9). Total num frames: 229376. Throughput: 0: 289.4. Samples: 57411. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:10,485][00034] Avg episode reward: [(0, '-7.592')] [2024-08-05 05:47:14,635][00139] DAMAGECOUNT value on done: 786.0 [2024-08-05 05:47:14,635][00139] Sum rewards: -9.311, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.664', 'AMMO2': '0.008', 'HITCOUNT': '0.030', 'ARMOR': '0.036', 'AMMO4': '0.040', 'DAMAGECOUNT': '0.060', 'WEAPON4': '0.100', 'weapon4': '0.104', 'AMMO3': '0.205', 'weapon2': '0.794', 'weapon3': '0.926', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050'} [2024-08-05 05:47:14,879][00139] DAMAGECOUNT value on done: 753.0 [2024-08-05 05:47:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1118.9). Total num frames: 229376. Throughput: 0: 289.8. Samples: 58304. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:15,485][00034] Avg episode reward: [(0, '-7.585')] [2024-08-05 05:47:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1131.3). Total num frames: 237568. Throughput: 0: 289.9. Samples: 60038. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:20,487][00034] Avg episode reward: [(0, '-7.585')] [2024-08-05 05:47:24,087][00138] Updated weights for policy 0, policy_version 30 (0.0018) [2024-08-05 05:47:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1143.1). Total num frames: 245760. Throughput: 0: 289.8. Samples: 61753. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:25,484][00034] Avg episode reward: [(0, '-7.585')] [2024-08-05 05:47:29,222][00139] DAMAGECOUNT value on done: 791.0 [2024-08-05 05:47:29,449][00139] DAMAGECOUNT value on done: 868.0 [2024-08-05 05:47:29,449][00139] Sum rewards: -3.610, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.200', 'AMMO5': '0.003', 'weapon4': '0.018', 'WEAPON1': '0.020', 'AMMO2': '0.024', 'WEAPON5': '0.050', 'HITCOUNT': '0.060', 'ARMOR': '0.064', 'weapon5': '0.074', 'AMMO3': '0.095', 'AMMO4': '0.119', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.345', 'WEAPON3': '0.500', 'weapon2': '0.750', 'weapon3': '0.818', 'FRAGCOUNT': '1.000'} [2024-08-05 05:47:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1117.1). Total num frames: 245760. Throughput: 0: 290.5. Samples: 62631. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:30,485][00034] Avg episode reward: [(0, '-7.466')] [2024-08-05 05:47:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1128.7). Total num frames: 253952. Throughput: 0: 289.8. Samples: 64355. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:35,484][00034] Avg episode reward: [(0, '-7.466')] [2024-08-05 05:47:35,490][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000031_253952.pth... [2024-08-05 05:47:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1139.8). Total num frames: 262144. Throughput: 0: 290.4. Samples: 66118. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:40,484][00034] Avg episode reward: [(0, '-7.466')] [2024-08-05 05:47:43,603][00139] DAMAGECOUNT value on done: 801.0 [2024-08-05 05:47:43,825][00139] DAMAGECOUNT value on done: 898.0 [2024-08-05 05:47:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1150.4). Total num frames: 270336. Throughput: 0: 290.7. Samples: 67003. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:45,484][00034] Avg episode reward: [(0, '-7.460')] [2024-08-05 05:47:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1126.4). Total num frames: 270336. Throughput: 0: 290.2. Samples: 68740. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:50,487][00034] Avg episode reward: [(0, '-7.460')] [2024-08-05 05:47:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1136.8). Total num frames: 278528. Throughput: 0: 290.9. Samples: 70501. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:47:55,486][00034] Avg episode reward: [(0, '-7.460')] [2024-08-05 05:47:58,060][00139] DAMAGECOUNT value on done: 936.0 [2024-08-05 05:47:58,060][00139] Sum rewards: -6.209, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.842', 'AMMO2': '0.013', 'ARMOR': '0.040', 'AMMO4': '0.066', 'HITCOUNT': '0.080', 'WEAPON4': '0.150', 'AMMO3': '0.158', 'weapon4': '0.268', 'DAMAGECOUNT': '0.405', 'weapon3': '0.590', 'weapon2': '0.762', 'WEAPON3': '0.850', 'FRAGCOUNT': '2.000'} [2024-08-05 05:47:58,292][00139] DAMAGECOUNT value on done: 943.0 [2024-08-05 05:47:58,292][00139] Sum rewards: -4.585, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO5': '0.005', 'AMMO2': '0.013', 'ARMOR': '0.040', 'weapon5': '0.042', 'HITCOUNT': '0.060', 'AMMO4': '0.066', 'HEALTH': '0.090', 'AMMO3': '0.096', 'WEAPON5': '0.100', 'DAMAGECOUNT': '0.135', 'WEAPON4': '0.150', 'weapon4': '0.164', 'WEAPON3': '0.400', 'weapon3': '0.514', 'weapon2': '0.790', 'FRAGCOUNT': '1.000'} [2024-08-05 05:48:00,483][00034] Fps is (10 
sec: 1638.4, 60 sec: 1228.8, 300 sec: 1146.9). Total num frames: 286720. Throughput: 0: 290.0. Samples: 71355. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:00,484][00034] Avg episode reward: [(0, '-7.339')] [2024-08-05 05:48:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1124.4). Total num frames: 286720. Throughput: 0: 290.7. Samples: 73118. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:05,485][00034] Avg episode reward: [(0, '-7.339')] [2024-08-05 05:48:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1134.3). Total num frames: 294912. Throughput: 0: 289.8. Samples: 74794. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:10,484][00034] Avg episode reward: [(0, '-7.339')] [2024-08-05 05:48:12,680][00139] DAMAGECOUNT value on done: 1071.0 [2024-08-05 05:48:12,681][00139] Sum rewards: -7.780, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.710', 'AMMO5': '0.005', 'ARMOR': '0.008', 'weapon5': '0.014', 'AMMO2': '0.029', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.124', 'weapon4': '0.132', 'AMMO4': '0.143', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.405', 'weapon3': '0.408', 'WEAPON3': '0.600', 'weapon2': '0.902', 'FRAGCOUNT': '2.000'} [2024-08-05 05:48:12,908][00139] DAMAGECOUNT value on done: 993.0 [2024-08-05 05:48:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1143.8). Total num frames: 303104. Throughput: 0: 289.6. Samples: 75663. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:15,484][00034] Avg episode reward: [(0, '-7.260')] [2024-08-05 05:48:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1122.6). Total num frames: 303104. Throughput: 0: 290.8. Samples: 77443. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:20,484][00034] Avg episode reward: [(0, '-7.260')] [2024-08-05 05:48:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1132.0). Total num frames: 311296. Throughput: 0: 290.6. 
Samples: 79193. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:25,484][00034] Avg episode reward: [(0, '-7.260')] [2024-08-05 05:48:27,137][00139] DAMAGECOUNT value on done: 1081.0 [2024-08-05 05:48:27,377][00139] DAMAGECOUNT value on done: 1008.0 [2024-08-05 05:48:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1141.0). Total num frames: 319488. Throughput: 0: 288.8. Samples: 80000. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:30,485][00034] Avg episode reward: [(0, '-7.293')] [2024-08-05 05:48:34,796][00138] Updated weights for policy 0, policy_version 40 (0.0018) [2024-08-05 05:48:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1149.8). Total num frames: 327680. Throughput: 0: 289.1. Samples: 81749. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:35,485][00034] Avg episode reward: [(0, '-7.293')] [2024-08-05 05:48:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1129.9). Total num frames: 327680. Throughput: 0: 288.9. Samples: 83501. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:40,485][00034] Avg episode reward: [(0, '-7.293')] [2024-08-05 05:48:41,579][00139] DAMAGECOUNT value on done: 1111.0 [2024-08-05 05:48:41,579][00139] Sum rewards: -5.993, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.060', 'AMMO2': '0.008', 'AMMO5': '0.020', 'weapon5': '0.024', 'weapon4': '0.026', 'HITCOUNT': '0.030', 'ARMOR': '0.032', 'AMMO4': '0.042', 'DAMAGECOUNT': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.165', 'WEAPON5': '0.300', 'weapon2': '0.724', 'weapon3': '0.856', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000'} [2024-08-05 05:48:41,865][00139] DAMAGECOUNT value on done: 1073.0 [2024-08-05 05:48:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 335872. Throughput: 0: 289.3. Samples: 84373. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:45,485][00034] Avg episode reward: [(0, '-7.378')] [2024-08-05 05:48:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 344064. Throughput: 0: 288.5. Samples: 86100. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:50,484][00034] Avg episode reward: [(0, '-7.378')] [2024-08-05 05:48:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 344064. Throughput: 0: 290.2. Samples: 87853. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:48:55,485][00034] Avg episode reward: [(0, '-7.378')] [2024-08-05 05:48:56,163][00139] DAMAGECOUNT value on done: 1116.0 [2024-08-05 05:48:56,389][00139] DAMAGECOUNT value on done: 1255.0 [2024-08-05 05:48:56,389][00139] Sum rewards: -6.691, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.420', 'AMMO5': '0.010', 'AMMO2': '0.016', 'WEAPON1': '0.020', 'weapon5': '0.026', 'weapon4': '0.076', 'AMMO4': '0.078', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.135', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.546', 'WEAPON3': '0.750', 'weapon2': '0.834', 'weapon3': '0.878', 'FRAGCOUNT': '1.000'} [2024-08-05 05:49:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 352256. Throughput: 0: 289.4. Samples: 88684. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:00,484][00034] Avg episode reward: [(0, '-7.306')] [2024-08-05 05:49:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 360448. Throughput: 0: 287.8. Samples: 90393. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:05,485][00034] Avg episode reward: [(0, '-7.306')] [2024-08-05 05:49:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 368640. Throughput: 0: 287.8. Samples: 92145. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:10,485][00034] Avg episode reward: [(0, '-7.306')] [2024-08-05 05:49:10,855][00139] DAMAGECOUNT value on done: 1151.0 [2024-08-05 05:49:11,084][00139] DAMAGECOUNT value on done: 1260.0 [2024-08-05 05:49:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 368640. Throughput: 0: 289.0. Samples: 93005. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:15,484][00034] Avg episode reward: [(0, '-7.287')] [2024-08-05 05:49:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 376832. Throughput: 0: 290.4. Samples: 94816. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:20,484][00034] Avg episode reward: [(0, '-7.287')] [2024-08-05 05:49:25,057][00139] DAMAGECOUNT value on done: 1161.0 [2024-08-05 05:49:25,262][00139] DAMAGECOUNT value on done: 1380.0 [2024-08-05 05:49:25,263][00139] Sum rewards: -8.800, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.984', 'weapon5': '0.002', 'AMMO5': '0.003', 'AMMO2': '0.018', 'WEAPON1': '0.020', 'weapon4': '0.030', 'WEAPON5': '0.050', 'HITCOUNT': '0.080', 'ARMOR': '0.080', 'AMMO4': '0.091', 'AMMO3': '0.126', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.360', 'weapon3': '0.586', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.138'} [2024-08-05 05:49:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 385024. Throughput: 0: 290.1. Samples: 96556. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:25,484][00034] Avg episode reward: [(0, '-7.308')] [2024-08-05 05:49:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 385024. Throughput: 0: 289.6. Samples: 97404. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:30,486][00034] Avg episode reward: [(0, '-7.308')] [2024-08-05 05:49:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). 
Total num frames: 393216. Throughput: 0: 289.1. Samples: 99110. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:35,485][00034] Avg episode reward: [(0, '-7.308')] [2024-08-05 05:49:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000048_393216.pth... [2024-08-05 05:49:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000014_114688.pth [2024-08-05 05:49:39,724][00139] DAMAGECOUNT value on done: 1248.0 [2024-08-05 05:49:39,948][00139] DAMAGECOUNT value on done: 1385.0 [2024-08-05 05:49:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 401408. Throughput: 0: 288.9. Samples: 100854. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:40,484][00034] Avg episode reward: [(0, '-7.267')] [2024-08-05 05:49:45,355][00138] Updated weights for policy 0, policy_version 50 (0.0021) [2024-08-05 05:49:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 409600. Throughput: 0: 290.2. Samples: 101743. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:45,484][00034] Avg episode reward: [(0, '-7.267')] [2024-08-05 05:49:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 409600. Throughput: 0: 291.6. Samples: 103515. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:50,485][00034] Avg episode reward: [(0, '-7.267')] [2024-08-05 05:49:54,063][00139] DAMAGECOUNT value on done: 1288.0 [2024-08-05 05:49:54,293][00139] DAMAGECOUNT value on done: 1430.0 [2024-08-05 05:49:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 417792. Throughput: 0: 291.1. Samples: 105244. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:49:55,484][00034] Avg episode reward: [(0, '-7.193')] [2024-08-05 05:50:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). 
Total num frames: 425984. Throughput: 0: 291.4. Samples: 106116. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:00,484][00034] Avg episode reward: [(0, '-7.193')] [2024-08-05 05:50:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 425984. Throughput: 0: 289.2. Samples: 107830. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:05,484][00034] Avg episode reward: [(0, '-7.193')] [2024-08-05 05:50:08,510][00139] DAMAGECOUNT value on done: 1313.0 [2024-08-05 05:50:08,754][00139] DAMAGECOUNT value on done: 1660.0 [2024-08-05 05:50:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 434176. Throughput: 0: 289.8. Samples: 109595. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:10,485][00034] Avg episode reward: [(0, '-7.250')] [2024-08-05 05:50:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 442368. Throughput: 0: 291.4. Samples: 110516. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:15,484][00034] Avg episode reward: [(0, '-7.250')] [2024-08-05 05:50:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 450560. Throughput: 0: 293.3. Samples: 112308. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:20,485][00034] Avg episode reward: [(0, '-7.250')] [2024-08-05 05:50:22,743][00139] DAMAGECOUNT value on done: 1523.0 [2024-08-05 05:50:22,743][00139] Sum rewards: -3.503, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.729', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.015', 'weapon5': '0.060', 'HITCOUNT': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.178', 'DAMAGECOUNT': '0.630', 'WEAPON3': '0.650', 'weapon3': '0.866', 'FRAGCOUNT': '1.000', 'weapon2': '1.078'} [2024-08-05 05:50:22,965][00139] DAMAGECOUNT value on done: 1705.0 [2024-08-05 05:50:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). 
Total num frames: 450560. Throughput: 0: 293.4. Samples: 114058. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:25,485][00034] Avg episode reward: [(0, '-7.147')] [2024-08-05 05:50:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 458752. Throughput: 0: 293.6. Samples: 114956. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:30,485][00034] Avg episode reward: [(0, '-7.147')] [2024-08-05 05:50:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 466944. Throughput: 0: 292.5. Samples: 116678. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:35,484][00034] Avg episode reward: [(0, '-7.147')] [2024-08-05 05:50:37,167][00139] DAMAGECOUNT value on done: 1533.0 [2024-08-05 05:50:37,400][00139] DAMAGECOUNT value on done: 1910.0 [2024-08-05 05:50:37,401][00139] Sum rewards: -5.137, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.572', 'AMMO5': '0.004', 'AMMO2': '0.024', 'weapon4': '0.040', 'weapon5': '0.042', 'HITCOUNT': '0.080', 'WEAPON5': '0.100', 'AMMO3': '0.120', 'AMMO4': '0.121', 'WEAPON4': '0.250', 'ARMOR': '0.479', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.650', 'weapon3': '0.754', 'weapon2': '0.906', 'FRAGCOUNT': '1.000'} [2024-08-05 05:50:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 466944. Throughput: 0: 292.3. Samples: 118399. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:40,484][00034] Avg episode reward: [(0, '-7.160')] [2024-08-05 05:50:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 475136. Throughput: 0: 292.2. Samples: 119265. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:45,484][00034] Avg episode reward: [(0, '-7.160')] [2024-08-05 05:50:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 483328. Throughput: 0: 293.7. Samples: 121045. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:50,484][00034] Avg episode reward: [(0, '-7.160')] [2024-08-05 05:50:51,433][00139] DAMAGECOUNT value on done: 1538.0 [2024-08-05 05:50:51,665][00139] DAMAGECOUNT value on done: 1925.0 [2024-08-05 05:50:55,457][00138] Updated weights for policy 0, policy_version 60 (0.0017) [2024-08-05 05:50:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 491520. Throughput: 0: 293.6. Samples: 122808. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:50:55,484][00034] Avg episode reward: [(0, '-7.100')] [2024-08-05 05:51:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 491520. Throughput: 0: 293.1. Samples: 123705. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:00,484][00034] Avg episode reward: [(0, '-7.100')] [2024-08-05 05:51:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 499712. Throughput: 0: 291.8. Samples: 125439. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:05,485][00034] Avg episode reward: [(0, '-7.100')] [2024-08-05 05:51:05,915][00139] DAMAGECOUNT value on done: 1548.0 [2024-08-05 05:51:06,155][00139] DAMAGECOUNT value on done: 2011.0 [2024-08-05 05:51:06,156][00139] Sum rewards: -6.741, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.860', 'AMMO2': '0.019', 'ARMOR': '0.036', 'HITCOUNT': '0.050', 'weapon4': '0.054', 'AMMO4': '0.097', 'AMMO3': '0.107', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.258', 'WEAPON3': '0.450', 'weapon3': '0.634', 'FRAGCOUNT': '1.000', 'weapon2': '1.264'} [2024-08-05 05:51:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 507904. Throughput: 0: 290.9. Samples: 127149. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:10,484][00034] Avg episode reward: [(0, '-7.139')] [2024-08-05 05:51:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 507904. Throughput: 0: 290.8. Samples: 128044. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:15,484][00034] Avg episode reward: [(0, '-7.139')] [2024-08-05 05:51:20,271][00139] DAMAGECOUNT value on done: 1558.0 [2024-08-05 05:51:20,476][00139] DAMAGECOUNT value on done: 2041.0 [2024-08-05 05:51:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 516096. Throughput: 0: 291.8. Samples: 129810. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:20,484][00034] Avg episode reward: [(0, '-7.092')] [2024-08-05 05:51:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 524288. Throughput: 0: 292.4. Samples: 131555. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:25,484][00034] Avg episode reward: [(0, '-7.147')] [2024-08-05 05:51:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 524288. Throughput: 0: 292.7. Samples: 132437. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:30,484][00034] Avg episode reward: [(0, '-7.147')] [2024-08-05 05:51:34,758][00139] DAMAGECOUNT value on done: 1613.0 [2024-08-05 05:51:34,758][00139] Sum rewards: -7.792, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.116', 'AMMO2': '0.038', 'ARMOR': '0.040', 'HITCOUNT': '0.040', 'AMMO3': '0.129', 'weapon4': '0.142', 'DAMAGECOUNT': '0.165', 'AMMO4': '0.192', 'WEAPON4': '0.450', 'weapon3': '0.610', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.318'} [2024-08-05 05:51:34,991][00139] DAMAGECOUNT value on done: 2051.0 [2024-08-05 05:51:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 532480. Throughput: 0: 291.3. Samples: 134155. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:35,484][00034] Avg episode reward: [(0, '-7.154')] [2024-08-05 05:51:35,490][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000065_532480.pth... [2024-08-05 05:51:35,560][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000031_253952.pth [2024-08-05 05:51:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 540672. Throughput: 0: 289.5. Samples: 135837. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:40,484][00034] Avg episode reward: [(0, '-7.154')] [2024-08-05 05:51:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 548864. Throughput: 0: 289.9. Samples: 136751. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:45,484][00034] Avg episode reward: [(0, '-7.154')] [2024-08-05 05:51:49,436][00139] DAMAGECOUNT value on done: 1830.0 [2024-08-05 05:51:49,436][00139] Sum rewards: -4.109, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.942', 'AMMO5': '0.005', 'AMMO2': '0.018', 'AMMO3': '0.059', 'weapon5': '0.074', 'weapon4': '0.078', 'HITCOUNT': '0.080', 'AMMO4': '0.088', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.651', 'weapon3': '0.720', 'FRAGCOUNT': '1.000', 'weapon2': '1.010'} [2024-08-05 05:51:49,686][00139] DAMAGECOUNT value on done: 2076.0 [2024-08-05 05:51:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 548864. Throughput: 0: 289.2. Samples: 138455. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:50,484][00034] Avg episode reward: [(0, '-7.133')] [2024-08-05 05:51:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 557056. Throughput: 0: 288.2. Samples: 140116. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:51:55,484][00034] Avg episode reward: [(0, '-7.133')] [2024-08-05 05:52:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 565248. Throughput: 0: 288.4. Samples: 141021. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:00,484][00034] Avg episode reward: [(0, '-7.133')] [2024-08-05 05:52:04,376][00139] DAMAGECOUNT value on done: 1855.0 [2024-08-05 05:52:04,609][00139] DAMAGECOUNT value on done: 2091.0 [2024-08-05 05:52:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 565248. Throughput: 0: 286.2. Samples: 142687. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:05,485][00034] Avg episode reward: [(0, '-7.164')] [2024-08-05 05:52:06,674][00138] Updated weights for policy 0, policy_version 70 (0.0018) [2024-08-05 05:52:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 573440. Throughput: 0: 282.6. Samples: 144270. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:10,484][00034] Avg episode reward: [(0, '-7.164')] [2024-08-05 05:52:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 581632. Throughput: 0: 282.4. Samples: 145146. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:15,485][00034] Avg episode reward: [(0, '-7.164')] [2024-08-05 05:52:19,495][00139] DAMAGECOUNT value on done: 1965.0 [2024-08-05 05:52:19,496][00139] Sum rewards: -4.914, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.017', 'AMMO5': '0.003', 'AMMO2': '0.023', 'WEAPON5': '0.050', 'ARMOR': '0.076', 'HITCOUNT': '0.080', 'AMMO3': '0.110', 'AMMO4': '0.114', 'weapon4': '0.128', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.330', 'WEAPON3': '0.600', 'weapon3': '0.720', 'weapon2': '0.970', 'FRAGCOUNT': '1.000'} [2024-08-05 05:52:19,718][00139] DAMAGECOUNT value on done: 2101.0 [2024-08-05 05:52:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 581632. Throughput: 0: 281.5. Samples: 146822. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:20,484][00034] Avg episode reward: [(0, '-7.112')] [2024-08-05 05:52:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 589824. Throughput: 0: 282.5. Samples: 148548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:25,485][00034] Avg episode reward: [(0, '-7.112')] [2024-08-05 05:52:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 598016. Throughput: 0: 281.9. Samples: 149436. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:30,485][00034] Avg episode reward: [(0, '-7.112')] [2024-08-05 05:52:33,932][00139] DAMAGECOUNT value on done: 1985.0 [2024-08-05 05:52:33,933][00139] Sum rewards: 0.680, reward structure: {'DEATHCOUNT': '-3.750', 'WEAPON1': '0.010', 'HITCOUNT': '0.020', 'AMMO2': '0.020', 'ARMOR': '0.040', 'AMMO3': '0.060', 'DAMAGECOUNT': '0.060', 'weapon4': '0.092', 'WEAPON4': '0.100', 'AMMO4': '0.102', 'WEAPON3': '0.250', 'weapon3': '0.588', 'HEALTH': '0.840', 'FRAGCOUNT': '1.000', 'weapon2': '1.248'} [2024-08-05 05:52:34,157][00139] DAMAGECOUNT value on done: 2111.0 [2024-08-05 05:52:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 606208. Throughput: 0: 283.0. Samples: 151192. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:35,484][00034] Avg episode reward: [(0, '-6.974')] [2024-08-05 05:52:35,490][00132] Saving new best policy, reward=-6.974! [2024-08-05 05:52:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 606208. Throughput: 0: 283.6. Samples: 152879. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:40,485][00034] Avg episode reward: [(0, '-6.974')] [2024-08-05 05:52:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 614400. Throughput: 0: 282.2. Samples: 153718. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:45,484][00034] Avg episode reward: [(0, '-6.974')] [2024-08-05 05:52:48,733][00139] DAMAGECOUNT value on done: 1995.0 [2024-08-05 05:52:49,002][00139] DAMAGECOUNT value on done: 2221.0 [2024-08-05 05:52:49,003][00139] Sum rewards: -7.856, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.022', 'AMMO2': '0.003', 'AMMO5': '0.007', 'weapon5': '0.010', 'AMMO4': '0.013', 'weapon4': '0.024', 'ARMOR': '0.032', 'WEAPON4': '0.050', 'HITCOUNT': '0.080', 'WEAPON5': '0.100', 'AMMO3': '0.160', 'DAMAGECOUNT': '0.330', 'weapon3': '0.678', 'WEAPON3': '0.850', 'weapon2': '1.078', 'FRAGCOUNT': '2.000'} [2024-08-05 05:52:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 622592. Throughput: 0: 283.2. Samples: 155432. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:50,484][00034] Avg episode reward: [(0, '-6.922')] [2024-08-05 05:52:50,486][00132] Saving new best policy, reward=-6.922! [2024-08-05 05:52:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 622592. Throughput: 0: 286.9. Samples: 157180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:52:55,484][00034] Avg episode reward: [(0, '-6.922')] [2024-08-05 05:53:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 630784. Throughput: 0: 286.7. Samples: 158049. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:00,484][00034] Avg episode reward: [(0, '-6.922')] [2024-08-05 05:53:03,169][00139] DAMAGECOUNT value on done: 2080.0 [2024-08-05 05:53:03,170][00139] Sum rewards: -9.047, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.355', 'AMMO5': '0.003', 'AMMO2': '0.011', 'ARMOR': '0.044', 'AMMO4': '0.055', 'HITCOUNT': '0.090', 'weapon4': '0.128', 'WEAPON4': '0.150', 'AMMO3': '0.174', 'DAMAGECOUNT': '0.255', 'weapon3': '0.352', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.446'} [2024-08-05 05:53:03,402][00139] DAMAGECOUNT value on done: 2381.0 [2024-08-05 05:53:03,403][00139] Sum rewards: -7.338, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.542', 'AMMO2': '0.002', 'AMMO4': '0.010', 'ARMOR': '0.056', 'HITCOUNT': '0.090', 'AMMO3': '0.182', 'DAMAGECOUNT': '0.480', 'WEAPON3': '0.900', 'weapon3': '0.936', 'FRAGCOUNT': '1.000', 'weapon2': '1.048'} [2024-08-05 05:53:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 638976. Throughput: 0: 288.4. Samples: 159800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:05,484][00034] Avg episode reward: [(0, '-6.955')] [2024-08-05 05:53:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 638976. Throughput: 0: 289.4. Samples: 161573. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:10,485][00034] Avg episode reward: [(0, '-6.955')] [2024-08-05 05:53:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 647168. Throughput: 0: 287.3. Samples: 162363. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:15,484][00034] Avg episode reward: [(0, '-6.955')] [2024-08-05 05:53:17,831][00139] DAMAGECOUNT value on done: 2080.0 [2024-08-05 05:53:18,076][00139] DAMAGECOUNT value on done: 2411.0 [2024-08-05 05:53:18,334][00138] Updated weights for policy 0, policy_version 80 (0.0016) [2024-08-05 05:53:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 655360. Throughput: 0: 286.0. Samples: 164062. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:20,484][00034] Avg episode reward: [(0, '-6.931')] [2024-08-05 05:53:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 655360. Throughput: 0: 286.6. Samples: 165775. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:25,484][00034] Avg episode reward: [(0, '-6.931')] [2024-08-05 05:53:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 663552. Throughput: 0: 286.4. Samples: 166607. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:30,485][00034] Avg episode reward: [(0, '-6.931')] [2024-08-05 05:53:32,622][00139] DAMAGECOUNT value on done: 2167.0 [2024-08-05 05:53:32,910][00139] DAMAGECOUNT value on done: 2416.0 [2024-08-05 05:53:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 671744. Throughput: 0: 287.5. Samples: 168368. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:35,485][00034] Avg episode reward: [(0, '-6.934')] [2024-08-05 05:53:35,490][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000082_671744.pth... [2024-08-05 05:53:35,562][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000048_393216.pth [2024-08-05 05:53:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 679936. Throughput: 0: 287.6. Samples: 170122. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:40,484][00034] Avg episode reward: [(0, '-6.934')] [2024-08-05 05:53:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 679936. Throughput: 0: 287.0. Samples: 170966. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:45,484][00034] Avg episode reward: [(0, '-6.934')] [2024-08-05 05:53:47,251][00139] DAMAGECOUNT value on done: 2212.0 [2024-08-05 05:53:47,490][00139] DAMAGECOUNT value on done: 2416.0 [2024-08-05 05:53:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 688128. Throughput: 0: 285.9. Samples: 172665. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:50,485][00034] Avg episode reward: [(0, '-6.936')] [2024-08-05 05:53:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 696320. Throughput: 0: 285.7. Samples: 174431. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:53:55,484][00034] Avg episode reward: [(0, '-6.936')] [2024-08-05 05:54:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 696320. Throughput: 0: 288.2. Samples: 175332. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:00,485][00034] Avg episode reward: [(0, '-6.936')] [2024-08-05 05:54:01,681][00139] DAMAGECOUNT value on done: 2277.0 [2024-08-05 05:54:01,916][00139] DAMAGECOUNT value on done: 2463.0 [2024-08-05 05:54:01,916][00139] Sum rewards: -3.989, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.229', 'ARMOR': '0.012', 'AMMO2': '0.014', 'weapon4': '0.016', 'HITCOUNT': '0.050', 'WEAPON4': '0.050', 'AMMO4': '0.071', 'AMMO3': '0.124', 'DAMAGECOUNT': '0.141', 'WEAPON3': '0.700', 'weapon3': '0.742', 'FRAGCOUNT': '1.000', 'weapon2': '1.070'} [2024-08-05 05:54:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 704512. Throughput: 0: 288.8. Samples: 177058. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:05,484][00034] Avg episode reward: [(0, '-6.897')] [2024-08-05 05:54:05,489][00132] Saving new best policy, reward=-6.897! [2024-08-05 05:54:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 712704. Throughput: 0: 289.9. Samples: 178822. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:10,484][00034] Avg episode reward: [(0, '-6.897')] [2024-08-05 05:54:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 720896. Throughput: 0: 291.1. Samples: 179707. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:15,484][00034] Avg episode reward: [(0, '-6.897')] [2024-08-05 05:54:16,204][00139] DAMAGECOUNT value on done: 2292.0 [2024-08-05 05:54:16,413][00139] DAMAGECOUNT value on done: 2522.0 [2024-08-05 05:54:16,413][00139] Sum rewards: -9.604, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.580', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'HITCOUNT': '0.060', 'DAMAGECOUNT': '0.177', 'AMMO3': '0.209', 'ARMOR': '0.548', 'weapon2': '0.794', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon3': '1.150'} [2024-08-05 05:54:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 720896. Throughput: 0: 289.8. Samples: 181408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:20,485][00034] Avg episode reward: [(0, '-6.961')] [2024-08-05 05:54:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 729088. Throughput: 0: 289.4. Samples: 183143. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:25,484][00034] Avg episode reward: [(0, '-6.961')] [2024-08-05 05:54:29,302][00138] Updated weights for policy 0, policy_version 90 (0.0018) [2024-08-05 05:54:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 737280. Throughput: 0: 290.2. Samples: 184023. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:30,484][00034] Avg episode reward: [(0, '-6.961')] [2024-08-05 05:54:30,705][00139] DAMAGECOUNT value on done: 2374.0 [2024-08-05 05:54:30,932][00139] DAMAGECOUNT value on done: 2537.0 [2024-08-05 05:54:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 737280. Throughput: 0: 290.7. Samples: 185746. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:35,484][00034] Avg episode reward: [(0, '-6.995')] [2024-08-05 05:54:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 745472. Throughput: 0: 290.4. Samples: 187500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:40,485][00034] Avg episode reward: [(0, '-6.995')] [2024-08-05 05:54:45,120][00139] DAMAGECOUNT value on done: 2439.0 [2024-08-05 05:54:45,120][00139] Sum rewards: -9.331, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.282', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.005', 'weapon5': '0.006', 'ARMOR': '0.024', 'AMMO2': '0.024', 'weapon4': '0.026', 'HITCOUNT': '0.050', 'WEAPON5': '0.050', 'AMMO4': '0.120', 'AMMO3': '0.127', 'DAMAGECOUNT': '0.195', 'WEAPON4': '0.250', 'WEAPON3': '0.550', 'weapon3': '0.904', 'weapon2': '1.120'} [2024-08-05 05:54:45,368][00139] DAMAGECOUNT value on done: 2602.0 [2024-08-05 05:54:45,369][00139] Sum rewards: -6.608, reward structure: {'DEATHCOUNT': '-6.750', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.208', 'AMMO5': '0.004', 'AMMO2': '0.008', 'WEAPON1': '0.020', 'AMMO4': '0.038', 'HITCOUNT': '0.060', 'weapon5': '0.072', 'AMMO3': '0.085', 'WEAPON5': '0.100', 'DAMAGECOUNT': '0.195', 'WEAPON3': '0.450', 'weapon3': '0.690', 'weapon2': '1.128'} [2024-08-05 05:54:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 753664. Throughput: 0: 289.8. Samples: 188372. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:45,485][00034] Avg episode reward: [(0, '-7.016')] [2024-08-05 05:54:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 753664. Throughput: 0: 288.5. Samples: 190039. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:50,485][00034] Avg episode reward: [(0, '-7.016')] [2024-08-05 05:54:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 761856. Throughput: 0: 288.2. Samples: 191793. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:54:55,484][00034] Avg episode reward: [(0, '-7.016')] [2024-08-05 05:54:59,824][00139] DAMAGECOUNT value on done: 3034.0 [2024-08-05 05:54:59,825][00139] Sum rewards: -5.621, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.670', 'ARMOR': '0.012', 'AMMO2': '0.015', 'weapon7': '0.064', 'AMMO4': '0.073', 'AMMO3': '0.096', 'HITCOUNT': '0.120', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.158', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'WEAPON3': '0.400', 'weapon3': '0.508', 'DAMAGECOUNT': '0.885', 'FRAGCOUNT': '1.000', 'weapon2': '1.078'} [2024-08-05 05:55:00,064][00139] DAMAGECOUNT value on done: 2656.0 [2024-08-05 05:55:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 770048. Throughput: 0: 287.5. Samples: 192644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:00,484][00034] Avg episode reward: [(0, '-6.996')] [2024-08-05 05:55:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 778240. Throughput: 0: 288.7. Samples: 194398. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:05,485][00034] Avg episode reward: [(0, '-6.996')] [2024-08-05 05:55:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 778240. Throughput: 0: 289.6. Samples: 196173. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:10,485][00034] Avg episode reward: [(0, '-6.996')] [2024-08-05 05:55:14,119][00139] DAMAGECOUNT value on done: 3079.0 [2024-08-05 05:55:14,328][00139] DAMAGECOUNT value on done: 2696.0 [2024-08-05 05:55:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 786432. Throughput: 0: 289.2. Samples: 197038. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:15,484][00034] Avg episode reward: [(0, '-6.973')] [2024-08-05 05:55:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 794624. Throughput: 0: 288.5. Samples: 198728. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:20,484][00034] Avg episode reward: [(0, '-6.973')] [2024-08-05 05:55:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 794624. Throughput: 0: 288.6. Samples: 200487. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:25,484][00034] Avg episode reward: [(0, '-6.973')] [2024-08-05 05:55:28,847][00139] DAMAGECOUNT value on done: 3106.0 [2024-08-05 05:55:29,088][00139] DAMAGECOUNT value on done: 2711.0 [2024-08-05 05:55:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 802816. Throughput: 0: 288.1. Samples: 201337. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:30,484][00034] Avg episode reward: [(0, '-6.985')] [2024-08-05 05:55:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 811008. Throughput: 0: 288.6. Samples: 203024. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:35,484][00034] Avg episode reward: [(0, '-6.985')] [2024-08-05 05:55:35,490][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000099_811008.pth... 
[2024-08-05 05:55:35,572][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000065_532480.pth [2024-08-05 05:55:40,315][00138] Updated weights for policy 0, policy_version 100 (0.0017) [2024-08-05 05:55:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 819200. Throughput: 0: 288.5. Samples: 204777. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:40,484][00034] Avg episode reward: [(0, '-6.985')] [2024-08-05 05:55:43,446][00139] DAMAGECOUNT value on done: 3151.0 [2024-08-05 05:55:43,678][00139] DAMAGECOUNT value on done: 2776.0 [2024-08-05 05:55:43,679][00139] Sum rewards: -6.539, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.010', 'weapon5': '0.002', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'AMMO2': '0.037', 'weapon4': '0.044', 'WEAPON5': '0.050', 'HITCOUNT': '0.060', 'AMMO3': '0.086', 'AMMO4': '0.184', 'DAMAGECOUNT': '0.195', 'WEAPON4': '0.300', 'WEAPON3': '0.400', 'weapon3': '0.588', 'FRAGCOUNT': '1.000', 'weapon2': '1.250'} [2024-08-05 05:55:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 819200. Throughput: 0: 288.9. Samples: 205643. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:45,484][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:55:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 827392. Throughput: 0: 286.7. Samples: 207300. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:50,484][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:55:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 835584. Throughput: 0: 286.0. Samples: 209045. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:55:55,484][00034] Avg episode reward: [(0, '-6.998')] [2024-08-05 05:55:58,114][00139] DAMAGECOUNT value on done: 3211.0 [2024-08-05 05:55:58,350][00139] DAMAGECOUNT value on done: 2891.0 [2024-08-05 05:55:58,350][00139] Sum rewards: -4.906, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.710', 'AMMO5': '0.005', 'AMMO2': '0.017', 'HITCOUNT': '0.060', 'AMMO4': '0.085', 'WEAPON5': '0.100', 'AMMO3': '0.123', 'weapon4': '0.196', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.345', 'WEAPON3': '0.550', 'weapon3': '0.572', 'weapon2': '1.300', 'FRAGCOUNT': '2.000'} [2024-08-05 05:56:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 835584. Throughput: 0: 286.7. Samples: 209940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:00,484][00034] Avg episode reward: [(0, '-6.949')] [2024-08-05 05:56:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 843776. Throughput: 0: 288.5. Samples: 211709. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:05,484][00034] Avg episode reward: [(0, '-6.949')] [2024-08-05 05:56:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 851968. Throughput: 0: 288.1. Samples: 213451. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:10,485][00034] Avg episode reward: [(0, '-6.949')] [2024-08-05 05:56:12,502][00139] DAMAGECOUNT value on done: 3215.0 [2024-08-05 05:56:12,722][00139] DAMAGECOUNT value on done: 2891.0 [2024-08-05 05:56:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 851968. Throughput: 0: 288.4. Samples: 214316. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:15,484][00034] Avg episode reward: [(0, '-6.901')] [2024-08-05 05:56:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 860160. Throughput: 0: 289.3. Samples: 216043. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:20,484][00034] Avg episode reward: [(0, '-6.901')] [2024-08-05 05:56:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 868352. Throughput: 0: 287.3. Samples: 217705. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:25,485][00034] Avg episode reward: [(0, '-6.901')] [2024-08-05 05:56:27,292][00139] DAMAGECOUNT value on done: 3225.0 [2024-08-05 05:56:27,524][00139] DAMAGECOUNT value on done: 3011.0 [2024-08-05 05:56:27,524][00139] Sum rewards: -4.638, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.317', 'AMMO5': '0.003', 'weapon5': '0.006', 'AMMO2': '0.013', 'ARMOR': '0.036', 'WEAPON5': '0.050', 'AMMO4': '0.063', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.132', 'DAMAGECOUNT': '0.360', 'WEAPON3': '0.700', 'weapon3': '0.782', 'FRAGCOUNT': '1.000', 'weapon2': '1.344'} [2024-08-05 05:56:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 876544. Throughput: 0: 287.4. Samples: 218575. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:30,484][00034] Avg episode reward: [(0, '-6.851')] [2024-08-05 05:56:30,487][00132] Saving new best policy, reward=-6.851! [2024-08-05 05:56:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 876544. Throughput: 0: 289.4. Samples: 220323. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:35,485][00034] Avg episode reward: [(0, '-6.851')] [2024-08-05 05:56:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 884736. Throughput: 0: 289.3. Samples: 222064. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:40,485][00034] Avg episode reward: [(0, '-6.851')] [2024-08-05 05:56:41,736][00139] DAMAGECOUNT value on done: 3225.0 [2024-08-05 05:56:41,963][00139] DAMAGECOUNT value on done: 3046.0 [2024-08-05 05:56:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 892928. Throughput: 0: 288.8. Samples: 222938. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:45,484][00034] Avg episode reward: [(0, '-6.804')] [2024-08-05 05:56:45,492][00132] Saving new best policy, reward=-6.804! [2024-08-05 05:56:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 892928. Throughput: 0: 287.4. Samples: 224642. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:50,485][00034] Avg episode reward: [(0, '-6.804')] [2024-08-05 05:56:51,431][00138] Updated weights for policy 0, policy_version 110 (0.0018) [2024-08-05 05:56:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 901120. Throughput: 0: 285.9. Samples: 226315. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:56:55,485][00034] Avg episode reward: [(0, '-6.804')] [2024-08-05 05:56:56,523][00139] DAMAGECOUNT value on done: 3310.0 [2024-08-05 05:56:56,756][00139] DAMAGECOUNT value on done: 3061.0 [2024-08-05 05:57:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 909312. Throughput: 0: 286.1. Samples: 227192. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:00,485][00034] Avg episode reward: [(0, '-6.763')] [2024-08-05 05:57:00,486][00132] Saving new best policy, reward=-6.763! [2024-08-05 05:57:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 909312. Throughput: 0: 286.6. Samples: 228939. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:05,485][00034] Avg episode reward: [(0, '-6.763')] [2024-08-05 05:57:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 917504. Throughput: 0: 287.4. Samples: 230640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:10,484][00034] Avg episode reward: [(0, '-6.763')] [2024-08-05 05:57:11,159][00139] DAMAGECOUNT value on done: 3340.0 [2024-08-05 05:57:11,390][00139] DAMAGECOUNT value on done: 3071.0 [2024-08-05 05:57:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 925696. Throughput: 0: 286.7. Samples: 231478. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:15,484][00034] Avg episode reward: [(0, '-6.799')] [2024-08-05 05:57:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 933888. Throughput: 0: 287.1. Samples: 233242. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:20,486][00034] Avg episode reward: [(0, '-6.799')] [2024-08-05 05:57:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 933888. Throughput: 0: 286.0. Samples: 234936. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:25,484][00034] Avg episode reward: [(0, '-6.799')] [2024-08-05 05:57:25,829][00139] DAMAGECOUNT value on done: 3360.0 [2024-08-05 05:57:26,066][00139] DAMAGECOUNT value on done: 3182.0 [2024-08-05 05:57:26,066][00139] Sum rewards: -6.779, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.254', 'AMMO5': '0.010', 'AMMO2': '0.023', 'ARMOR': '0.032', 'AMMO3': '0.069', 'weapon5': '0.092', 'HITCOUNT': '0.100', 'AMMO4': '0.114', 'weapon4': '0.154', 'WEAPON4': '0.200', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.333', 'weapon3': '0.336', 'WEAPON3': '0.350', 'weapon2': '1.212'} [2024-08-05 05:57:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 942080. 
Throughput: 0: 285.4. Samples: 235779. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:30,484][00034] Avg episode reward: [(0, '-6.781')] [2024-08-05 05:57:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 950272. Throughput: 0: 286.9. Samples: 237551. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:35,484][00034] Avg episode reward: [(0, '-6.781')] [2024-08-05 05:57:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000116_950272.pth... [2024-08-05 05:57:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000082_671744.pth [2024-08-05 05:57:40,330][00139] DAMAGECOUNT value on done: 3435.0 [2024-08-05 05:57:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 950272. Throughput: 0: 288.0. Samples: 239277. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:40,484][00034] Avg episode reward: [(0, '-6.741')] [2024-08-05 05:57:40,486][00132] Saving new best policy, reward=-6.741! [2024-08-05 05:57:40,575][00139] DAMAGECOUNT value on done: 3277.0 [2024-08-05 05:57:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 958464. Throughput: 0: 287.3. Samples: 240122. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:45,485][00034] Avg episode reward: [(0, '-6.758')] [2024-08-05 05:57:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 966656. Throughput: 0: 287.6. Samples: 241882. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:50,484][00034] Avg episode reward: [(0, '-6.758')] [2024-08-05 05:57:54,799][00139] DAMAGECOUNT value on done: 3435.0 [2024-08-05 05:57:55,077][00139] DAMAGECOUNT value on done: 3317.0 [2024-08-05 05:57:55,078][00139] Sum rewards: -3.432, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.712', 'AMMO2': '0.020', 'weapon4': '0.046', 'HITCOUNT': '0.050', 'ARMOR': '0.076', 'AMMO4': '0.098', 'WEAPON4': '0.100', 'AMMO3': '0.106', 'DAMAGECOUNT': '0.120', 'WEAPON3': '0.500', 'weapon3': '0.796', 'FRAGCOUNT': '1.000', 'weapon2': '1.118'} [2024-08-05 05:57:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 966656. Throughput: 0: 288.2. Samples: 243610. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:57:55,485][00034] Avg episode reward: [(0, '-6.748')] [2024-08-05 05:58:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 974848. Throughput: 0: 288.3. Samples: 244450. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:00,485][00034] Avg episode reward: [(0, '-6.748')] [2024-08-05 05:58:02,775][00138] Updated weights for policy 0, policy_version 120 (0.0017) [2024-08-05 05:58:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 983040. Throughput: 0: 287.0. Samples: 246158. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:05,485][00034] Avg episode reward: [(0, '-6.748')] [2024-08-05 05:58:09,631][00139] DAMAGECOUNT value on done: 3540.0 [2024-08-05 05:58:09,631][00139] Sum rewards: -5.516, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.590', 'WEAPON1': '0.020', 'AMMO2': '0.030', 'HITCOUNT': '0.050', 'weapon4': '0.056', 'AMMO3': '0.108', 'AMMO4': '0.151', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.315', 'WEAPON3': '0.500', 'weapon2': '0.936', 'weapon3': '0.958', 'FRAGCOUNT': '1.000'} [2024-08-05 05:58:09,866][00139] DAMAGECOUNT value on done: 3417.0 [2024-08-05 05:58:09,867][00139] Sum rewards: -8.595, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.092', 'AMMO2': '0.016', 'HITCOUNT': '0.060', 'AMMO4': '0.077', 'ARMOR': '0.080', 'WEAPON4': '0.100', 'weapon4': '0.124', 'AMMO3': '0.136', 'DAMAGECOUNT': '0.300', 'WEAPON3': '0.700', 'weapon3': '0.708', 'FRAGCOUNT': '1.000', 'weapon2': '1.196'} [2024-08-05 05:58:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 991232. Throughput: 0: 287.1. Samples: 247855. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:10,484][00034] Avg episode reward: [(0, '-6.732')] [2024-08-05 05:58:10,486][00132] Saving new best policy, reward=-6.732! [2024-08-05 05:58:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 991232. Throughput: 0: 287.4. Samples: 248713. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:15,486][00034] Avg episode reward: [(0, '-6.732')] [2024-08-05 05:58:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 999424. Throughput: 0: 287.1. Samples: 250470. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:20,485][00034] Avg episode reward: [(0, '-6.732')] [2024-08-05 05:58:24,247][00139] DAMAGECOUNT value on done: 3540.0 [2024-08-05 05:58:24,494][00139] DAMAGECOUNT value on done: 3592.0 [2024-08-05 05:58:24,495][00139] Sum rewards: -9.357, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.754', 'AMMO2': '0.015', 'ARMOR': '0.020', 'weapon4': '0.048', 'AMMO4': '0.076', 'HITCOUNT': '0.150', 'WEAPON4': '0.200', 'AMMO3': '0.214', 'DAMAGECOUNT': '0.525', 'weapon3': '0.814', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'weapon2': '1.184'} [2024-08-05 05:58:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1007616. Throughput: 0: 286.7. Samples: 252177. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:25,484][00034] Avg episode reward: [(0, '-6.751')] [2024-08-05 05:58:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1007616. Throughput: 0: 285.8. Samples: 252981. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:30,484][00034] Avg episode reward: [(0, '-6.751')] [2024-08-05 05:58:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1015808. Throughput: 0: 285.1. Samples: 254711. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:35,485][00034] Avg episode reward: [(0, '-6.751')] [2024-08-05 05:58:39,091][00139] DAMAGECOUNT value on done: 3545.0 [2024-08-05 05:58:39,322][00139] DAMAGECOUNT value on done: 3757.0 [2024-08-05 05:58:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1024000. Throughput: 0: 284.7. Samples: 256423. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:40,484][00034] Avg episode reward: [(0, '-6.664')] [2024-08-05 05:58:40,486][00132] Saving new best policy, reward=-6.664! [2024-08-05 05:58:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 1024000. Throughput: 0: 285.4. Samples: 257295. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:45,484][00034] Avg episode reward: [(0, '-6.664')] [2024-08-05 05:58:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1032192. Throughput: 0: 285.8. Samples: 259017. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:50,484][00034] Avg episode reward: [(0, '-6.664')] [2024-08-05 05:58:53,683][00139] DAMAGECOUNT value on done: 3697.0 [2024-08-05 05:58:53,683][00139] Sum rewards: -3.367, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.860', 'AMMO4': '-0.000', 'AMMO2': '-0.000', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'weapon5': '0.062', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.108', 'DAMAGECOUNT': '0.456', 'ARMOR': '0.500', 'WEAPON3': '0.650', 'weapon2': '0.776', 'FRAGCOUNT': '1.000', 'weapon3': '1.216'} [2024-08-05 05:58:53,906][00139] DAMAGECOUNT value on done: 3789.0 [2024-08-05 05:58:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1040384. Throughput: 0: 286.8. Samples: 260761. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:58:55,484][00034] Avg episode reward: [(0, '-6.554')] [2024-08-05 05:58:55,492][00132] Saving new best policy, reward=-6.554! [2024-08-05 05:59:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1048576. Throughput: 0: 286.9. Samples: 261625. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:00,484][00034] Avg episode reward: [(0, '-6.554')] [2024-08-05 05:59:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1048576. Throughput: 0: 286.7. Samples: 263370. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:05,484][00034] Avg episode reward: [(0, '-6.554')] [2024-08-05 05:59:08,178][00139] DAMAGECOUNT value on done: 3772.0 [2024-08-05 05:59:08,405][00139] DAMAGECOUNT value on done: 3864.0 [2024-08-05 05:59:08,406][00139] Sum rewards: -2.086, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO5': '0.007', 'AMMO2': '0.025', 'ARMOR': '0.040', 'HITCOUNT': '0.070', 'AMMO3': '0.086', 'AMMO4': '0.124', 'weapon4': '0.136', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.225', 'HEALTH': '0.312', 'WEAPON3': '0.450', 'weapon3': '0.826', 'FRAGCOUNT': '1.000', 'weapon2': '1.062'} [2024-08-05 05:59:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1056768. Throughput: 0: 286.8. Samples: 265081. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:10,485][00034] Avg episode reward: [(0, '-6.473')] [2024-08-05 05:59:10,486][00132] Saving new best policy, reward=-6.473! [2024-08-05 05:59:14,136][00138] Updated weights for policy 0, policy_version 130 (0.0017) [2024-08-05 05:59:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1064960. Throughput: 0: 288.5. Samples: 265964. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:15,484][00034] Avg episode reward: [(0, '-6.473')] [2024-08-05 05:59:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1064960. Throughput: 0: 289.0. Samples: 267716. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:20,485][00034] Avg episode reward: [(0, '-6.473')] [2024-08-05 05:59:22,706][00139] DAMAGECOUNT value on done: 3803.0 [2024-08-05 05:59:22,706][00139] Sum rewards: -8.617, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-3.392', 'AMMO2': '0.012', 'HITCOUNT': '0.040', 'AMMO4': '0.058', 'weapon4': '0.090', 'DAMAGECOUNT': '0.093', 'AMMO3': '0.137', 'WEAPON4': '0.200', 'ARMOR': '0.491', 'weapon3': '0.570', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.184'} [2024-08-05 05:59:22,932][00139] DAMAGECOUNT value on done: 3904.0 [2024-08-05 05:59:22,932][00139] Sum rewards: -5.833, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.844', 'AMMO2': '0.005', 'AMMO4': '0.023', 'HITCOUNT': '0.070', 'ARMOR': '0.072', 'WEAPON4': '0.100', 'DAMAGECOUNT': '0.120', 'weapon4': '0.128', 'AMMO3': '0.145', 'WEAPON3': '0.700', 'weapon3': '0.928', 'weapon2': '0.970', 'FRAGCOUNT': '1.000'} [2024-08-05 05:59:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1073152. Throughput: 0: 289.6. Samples: 269453. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:25,484][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 05:59:25,492][00132] Saving new best policy, reward=-6.468! [2024-08-05 05:59:30,483][00034] Fps is (10 sec: 1638.3, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1081344. Throughput: 0: 289.4. Samples: 270317. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:30,486][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 05:59:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1081344. Throughput: 0: 286.6. Samples: 271914. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:35,484][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 05:59:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000132_1081344.pth... 
[2024-08-05 05:59:35,560][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000099_811008.pth [2024-08-05 05:59:37,696][00139] DAMAGECOUNT value on done: 3838.0 [2024-08-05 05:59:37,931][00139] DAMAGECOUNT value on done: 3934.0 [2024-08-05 05:59:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1089536. Throughput: 0: 286.1. Samples: 273634. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:40,484][00034] Avg episode reward: [(0, '-6.538')] [2024-08-05 05:59:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1097728. Throughput: 0: 286.0. Samples: 274493. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:45,484][00034] Avg episode reward: [(0, '-6.538')] [2024-08-05 05:59:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1105920. Throughput: 0: 287.2. Samples: 276296. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:50,484][00034] Avg episode reward: [(0, '-6.538')] [2024-08-05 05:59:52,159][00139] DAMAGECOUNT value on done: 4013.0 [2024-08-05 05:59:52,159][00139] Sum rewards: -3.541, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.590', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'weapon5': '0.006', 'AMMO5': '0.009', 'AMMO3': '0.084', 'HITCOUNT': '0.110', 'WEAPON5': '0.200', 'WEAPON3': '0.450', 'ARMOR': '0.496', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.525', 'weapon3': '0.888', 'weapon2': '1.292'} [2024-08-05 05:59:52,396][00139] DAMAGECOUNT value on done: 4030.0 [2024-08-05 05:59:52,396][00139] Sum rewards: -8.764, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.366', 'AMMO2': '0.021', 'weapon4': '0.046', 'HITCOUNT': '0.090', 'ARMOR': '0.104', 'AMMO4': '0.106', 'WEAPON4': '0.150', 'AMMO3': '0.173', 'DAMAGECOUNT': '0.288', 'weapon3': '0.694', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.280'} [2024-08-05 05:59:55,483][00034] Fps is (10 sec: 
819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1105920. Throughput: 0: 287.8. Samples: 278031. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 05:59:55,486][00034] Avg episode reward: [(0, '-6.513')] [2024-08-05 06:00:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1114112. Throughput: 0: 287.0. Samples: 278880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:00,484][00034] Avg episode reward: [(0, '-6.513')] [2024-08-05 06:00:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1122304. Throughput: 0: 286.0. Samples: 280585. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:05,484][00034] Avg episode reward: [(0, '-6.513')] [2024-08-05 06:00:06,652][00139] DAMAGECOUNT value on done: 4046.0 [2024-08-05 06:00:06,653][00139] Sum rewards: -6.552, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.009', 'AMMO2': '0.032', 'HITCOUNT': '0.040', 'weapon5': '0.056', 'AMMO3': '0.086', 'DAMAGECOUNT': '0.099', 'ARMOR': '0.114', 'WEAPON5': '0.150', 'weapon4': '0.158', 'AMMO4': '0.159', 'WEAPON4': '0.200', 'HEALTH': '0.374', 'WEAPON3': '0.400', 'weapon3': '0.788', 'weapon2': '1.282'} [2024-08-05 06:00:06,864][00139] DAMAGECOUNT value on done: 4055.0 [2024-08-05 06:00:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1122304. Throughput: 0: 286.8. Samples: 282357. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:10,484][00034] Avg episode reward: [(0, '-6.524')] [2024-08-05 06:00:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1130496. Throughput: 0: 286.6. Samples: 283215. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:15,484][00034] Avg episode reward: [(0, '-6.524')] [2024-08-05 06:00:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1138688. Throughput: 0: 290.2. 
Samples: 284973. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:20,484][00034] Avg episode reward: [(0, '-6.524')] [2024-08-05 06:00:21,054][00139] DAMAGECOUNT value on done: 4071.0 [2024-08-05 06:00:21,283][00139] DAMAGECOUNT value on done: 4070.0 [2024-08-05 06:00:25,240][00138] Updated weights for policy 0, policy_version 140 (0.0018) [2024-08-05 06:00:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1146880. Throughput: 0: 290.7. Samples: 286715. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:25,485][00034] Avg episode reward: [(0, '-6.608')] [2024-08-05 06:00:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1146880. Throughput: 0: 290.7. Samples: 287576. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:30,484][00034] Avg episode reward: [(0, '-6.608')] [2024-08-05 06:00:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1155072. Throughput: 0: 288.0. Samples: 289256. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:35,484][00034] Avg episode reward: [(0, '-6.608')] [2024-08-05 06:00:35,731][00139] DAMAGECOUNT value on done: 4091.0 [2024-08-05 06:00:35,970][00139] DAMAGECOUNT value on done: 4170.0 [2024-08-05 06:00:35,970][00139] Sum rewards: -7.717, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.580', 'AMMO5': '0.008', 'AMMO2': '0.008', 'weapon5': '0.020', 'AMMO4': '0.040', 'HITCOUNT': '0.050', 'weapon4': '0.054', 'ARMOR': '0.076', 'AMMO3': '0.097', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.300', 'WEAPON3': '0.500', 'weapon3': '0.528', 'FRAGCOUNT': '1.000', 'weapon2': '1.432'} [2024-08-05 06:00:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1163264. Throughput: 0: 288.6. Samples: 291016. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:40,484][00034] Avg episode reward: [(0, '-6.597')] [2024-08-05 06:00:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1163264. Throughput: 0: 289.2. Samples: 291895. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:45,484][00034] Avg episode reward: [(0, '-6.597')] [2024-08-05 06:00:50,239][00139] DAMAGECOUNT value on done: 4213.0 [2024-08-05 06:00:50,472][00139] DAMAGECOUNT value on done: 4234.0 [2024-08-05 06:00:50,473][00139] Sum rewards: -9.881, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.712', 'AMMO2': '0.019', 'HITCOUNT': '0.080', 'ARMOR': '0.096', 'AMMO4': '0.097', 'weapon4': '0.156', 'AMMO3': '0.177', 'DAMAGECOUNT': '0.192', 'WEAPON4': '0.250', 'weapon3': '0.768', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.146'} [2024-08-05 06:00:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1171456. Throughput: 0: 289.7. Samples: 293621. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:50,484][00034] Avg episode reward: [(0, '-6.580')] [2024-08-05 06:00:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1179648. Throughput: 0: 289.0. Samples: 295363. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:00:55,484][00034] Avg episode reward: [(0, '-6.559')] [2024-08-05 06:01:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1179648. Throughput: 0: 288.8. Samples: 296209. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:00,484][00034] Avg episode reward: [(0, '-6.559')] [2024-08-05 06:01:04,849][00139] DAMAGECOUNT value on done: 4223.0 [2024-08-05 06:01:05,114][00139] DAMAGECOUNT value on done: 4361.0 [2024-08-05 06:01:05,114][00139] Sum rewards: -7.450, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.706', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.006', 'AMMO2': '0.017', 'weapon4': '0.020', 'ARMOR': '0.044', 'AMMO4': '0.085', 'AMMO3': '0.085', 'weapon5': '0.110', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.381', 'WEAPON3': '0.450', 'weapon3': '0.628', 'weapon2': '1.250'} [2024-08-05 06:01:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1187840. Throughput: 0: 288.4. Samples: 297949. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:05,484][00034] Avg episode reward: [(0, '-6.567')] [2024-08-05 06:01:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1196032. Throughput: 0: 286.8. Samples: 299620. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:10,484][00034] Avg episode reward: [(0, '-6.567')] [2024-08-05 06:01:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1204224. Throughput: 0: 287.0. Samples: 300490. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:15,484][00034] Avg episode reward: [(0, '-6.567')] [2024-08-05 06:01:19,426][00139] DAMAGECOUNT value on done: 4308.0 [2024-08-05 06:01:19,427][00139] Sum rewards: -3.035, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.188', 'AMMO5': '0.005', 'AMMO2': '0.006', 'WEAPON1': '0.010', 'ARMOR': '0.016', 'weapon4': '0.028', 'AMMO4': '0.032', 'HITCOUNT': '0.050', 'WEAPON4': '0.050', 'weapon5': '0.056', 'AMMO3': '0.080', 'WEAPON5': '0.100', 'DAMAGECOUNT': '0.255', 'WEAPON3': '0.500', 'weapon3': '0.794', 'FRAGCOUNT': '1.000', 'weapon2': '1.170'} [2024-08-05 06:01:19,641][00139] DAMAGECOUNT value on done: 4391.0 [2024-08-05 06:01:19,641][00139] Sum rewards: -4.877, reward structure: {'DEATHCOUNT': '-9.000', 'AMMO5': '0.003', 'AMMO2': '0.014', 'HEALTH': '0.020', 'HITCOUNT': '0.030', 'WEAPON5': '0.050', 'ARMOR': '0.059', 'AMMO4': '0.071', 'DAMAGECOUNT': '0.090', 'AMMO3': '0.108', 'WEAPON4': '0.150', 'weapon4': '0.190', 'WEAPON3': '0.550', 'weapon3': '0.778', 'FRAGCOUNT': '1.000', 'weapon2': '1.010'} [2024-08-05 06:01:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1204224. Throughput: 0: 288.6. Samples: 302242. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:20,484][00034] Avg episode reward: [(0, '-6.508')] [2024-08-05 06:01:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1212416. Throughput: 0: 288.6. Samples: 304001. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:25,485][00034] Avg episode reward: [(0, '-6.508')] [2024-08-05 06:01:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1220608. Throughput: 0: 288.3. Samples: 304867. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:30,485][00034] Avg episode reward: [(0, '-6.508')] [2024-08-05 06:01:33,896][00139] DAMAGECOUNT value on done: 4358.0 [2024-08-05 06:01:34,108][00139] DAMAGECOUNT value on done: 4426.0 [2024-08-05 06:01:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1220608. Throughput: 0: 288.5. Samples: 306603. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:35,485][00034] Avg episode reward: [(0, '-6.502')] [2024-08-05 06:01:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000149_1220608.pth... [2024-08-05 06:01:35,571][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000116_950272.pth [2024-08-05 06:01:36,387][00138] Updated weights for policy 0, policy_version 150 (0.0017) [2024-08-05 06:01:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1228800. Throughput: 0: 286.6. Samples: 308258. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:40,484][00034] Avg episode reward: [(0, '-6.502')] [2024-08-05 06:01:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1236992. Throughput: 0: 287.4. Samples: 309140. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:45,484][00034] Avg episode reward: [(0, '-6.502')] [2024-08-05 06:01:45,775][00139] Large shaping reward -2.549 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.3, -100.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 06:01:48,565][00139] DAMAGECOUNT value on done: 4373.0 [2024-08-05 06:01:48,793][00139] DAMAGECOUNT value on done: 4553.0 [2024-08-05 06:01:48,794][00139] Sum rewards: -10.904, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.673', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.008', 'AMMO5': '0.009', 'weapon5': '0.014', 'weapon4': '0.030', 'AMMO4': '0.038', 'WEAPON4': '0.100', 'HITCOUNT': '0.120', 'AMMO3': '0.133', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.381', 'ARMOR': '0.408', 'weapon3': '0.678', 'WEAPON3': '0.700', 'weapon2': '1.200'} [2024-08-05 06:01:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 1236992. Throughput: 0: 287.4. Samples: 310884. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:50,485][00034] Avg episode reward: [(0, '-6.564')] [2024-08-05 06:01:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1245184. Throughput: 0: 290.2. Samples: 312678. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:01:55,485][00034] Avg episode reward: [(0, '-6.564')] [2024-08-05 06:02:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1253376. Throughput: 0: 289.4. Samples: 313512. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:00,484][00034] Avg episode reward: [(0, '-6.564')] [2024-08-05 06:02:03,144][00139] DAMAGECOUNT value on done: 4553.0 [2024-08-05 06:02:03,145][00139] Sum rewards: -4.684, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.030', 'AMMO5': '0.007', 'AMMO2': '0.012', 'weapon5': '0.024', 'AMMO4': '0.059', 'weapon4': '0.080', 'HITCOUNT': '0.090', 'ARMOR': '0.096', 'AMMO3': '0.115', 'WEAPON5': '0.150', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.540', 'WEAPON3': '0.600', 'weapon3': '0.654', 'FRAGCOUNT': '1.000', 'weapon2': '1.218'} [2024-08-05 06:02:03,370][00139] DAMAGECOUNT value on done: 4583.0 [2024-08-05 06:02:03,371][00139] Sum rewards: -1.769, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.398', 'AMMO5': '0.003', 'WEAPON1': '0.010', 'AMMO2': '0.027', 'HITCOUNT': '0.030', 'WEAPON5': '0.050', 'AMMO3': '0.066', 'ARMOR': '0.087', 'DAMAGECOUNT': '0.090', 'weapon4': '0.094', 'AMMO4': '0.136', 'weapon3': '0.154', 'WEAPON4': '0.250', 'WEAPON3': '0.300', 'FRAGCOUNT': '1.000', 'weapon2': '1.582'} [2024-08-05 06:02:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1261568. Throughput: 0: 288.3. Samples: 315214. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:05,484][00034] Avg episode reward: [(0, '-6.520')] [2024-08-05 06:02:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1261568. Throughput: 0: 286.9. Samples: 316913. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:10,484][00034] Avg episode reward: [(0, '-6.520')] [2024-08-05 06:02:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1269760. Throughput: 0: 286.7. Samples: 317770. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:15,485][00034] Avg episode reward: [(0, '-6.520')] [2024-08-05 06:02:17,859][00139] DAMAGECOUNT value on done: 4638.0 [2024-08-05 06:02:18,085][00139] DAMAGECOUNT value on done: 4676.0 [2024-08-05 06:02:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1277952. Throughput: 0: 286.4. Samples: 319493. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:20,484][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 06:02:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1277952. Throughput: 0: 289.9. Samples: 321302. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:25,485][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 06:02:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1286144. Throughput: 0: 289.5. Samples: 322166. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:30,484][00034] Avg episode reward: [(0, '-6.468')] [2024-08-05 06:02:32,056][00139] DAMAGECOUNT value on done: 4673.0 [2024-08-05 06:02:32,057][00139] Sum rewards: -5.823, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.340', 'AMMO2': '0.014', 'HITCOUNT': '0.040', 'ARMOR': '0.056', 'AMMO4': '0.072', 'weapon4': '0.084', 'DAMAGECOUNT': '0.105', 'WEAPON4': '0.150', 'AMMO3': '0.158', 'WEAPON3': '0.750', 'weapon3': '0.964', 'FRAGCOUNT': '1.000', 'weapon2': '1.124'} [2024-08-05 06:02:32,267][00139] DAMAGECOUNT value on done: 4691.0 [2024-08-05 06:02:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1294336. Throughput: 0: 289.5. Samples: 323910. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:35,485][00034] Avg episode reward: [(0, '-6.503')] [2024-08-05 06:02:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1302528. Throughput: 0: 286.8. Samples: 325582. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:40,484][00034] Avg episode reward: [(0, '-6.503')] [2024-08-05 06:02:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1302528. Throughput: 0: 286.8. Samples: 326416. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:45,484][00034] Avg episode reward: [(0, '-6.503')] [2024-08-05 06:02:46,930][00139] DAMAGECOUNT value on done: 4673.0 [2024-08-05 06:02:47,177][00139] DAMAGECOUNT value on done: 4691.0 [2024-08-05 06:02:47,550][00138] Updated weights for policy 0, policy_version 160 (0.0018) [2024-08-05 06:02:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1310720. Throughput: 0: 287.2. Samples: 328137. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:50,485][00034] Avg episode reward: [(0, '-6.498')] [2024-08-05 06:02:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1318912. Throughput: 0: 288.2. Samples: 329882. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:02:55,484][00034] Avg episode reward: [(0, '-6.498')] [2024-08-05 06:03:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1318912. Throughput: 0: 289.0. Samples: 330776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:00,485][00034] Avg episode reward: [(0, '-6.498')] [2024-08-05 06:03:01,473][00139] DAMAGECOUNT value on done: 4723.0 [2024-08-05 06:03:01,750][00139] DAMAGECOUNT value on done: 4701.0 [2024-08-05 06:03:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1327104. Throughput: 0: 288.9. Samples: 332494. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:05,484][00034] Avg episode reward: [(0, '-6.555')] [2024-08-05 06:03:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1335296. Throughput: 0: 287.4. Samples: 334236. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:10,484][00034] Avg episode reward: [(0, '-6.555')] [2024-08-05 06:03:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1335296. Throughput: 0: 287.1. Samples: 335085. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:15,484][00034] Avg episode reward: [(0, '-6.555')] [2024-08-05 06:03:16,085][00139] DAMAGECOUNT value on done: 5033.0 [2024-08-05 06:03:16,085][00139] Sum rewards: -5.455, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.888', 'AMMO5': '0.010', 'weapon5': '0.010', 'AMMO2': '0.016', 'ARMOR': '0.032', 'weapon4': '0.072', 'AMMO4': '0.081', 'WEAPON5': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.156', 'HITCOUNT': '0.210', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.930', 'weapon3': '0.964', 'weapon2': '1.202', 'FRAGCOUNT': '3.000'} [2024-08-05 06:03:16,322][00139] DAMAGECOUNT value on done: 4786.0 [2024-08-05 06:03:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1343488. Throughput: 0: 286.8. Samples: 336815. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:20,485][00034] Avg episode reward: [(0, '-6.519')] [2024-08-05 06:03:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1351680. Throughput: 0: 287.0. Samples: 338499. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:25,485][00034] Avg episode reward: [(0, '-6.519')] [2024-08-05 06:03:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1359872. Throughput: 0: 287.7. Samples: 339363. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:30,484][00034] Avg episode reward: [(0, '-6.519')] [2024-08-05 06:03:30,887][00139] DAMAGECOUNT value on done: 5133.0 [2024-08-05 06:03:30,888][00139] Sum rewards: -4.440, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.089', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'weapon5': '0.018', 'AMMO2': '0.022', 'HITCOUNT': '0.050', 'weapon4': '0.070', 'AMMO4': '0.112', 'ARMOR': '0.127', 'AMMO3': '0.141', 'WEAPON5': '0.150', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.300', 'WEAPON3': '0.800', 'weapon3': '0.858', 'FRAGCOUNT': '1.000', 'weapon2': '1.228'} [2024-08-05 06:03:31,140][00139] DAMAGECOUNT value on done: 4833.0 [2024-08-05 06:03:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1359872. Throughput: 0: 287.0. Samples: 341054. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:35,485][00034] Avg episode reward: [(0, '-6.490')] [2024-08-05 06:03:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_1359872.pth... [2024-08-05 06:03:35,569][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000132_1081344.pth [2024-08-05 06:03:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 1368064. Throughput: 0: 286.6. Samples: 342777. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:40,484][00034] Avg episode reward: [(0, '-6.490')] [2024-08-05 06:03:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1376256. Throughput: 0: 284.5. Samples: 343577. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:45,485][00034] Avg episode reward: [(0, '-6.490')] [2024-08-05 06:03:45,898][00139] DAMAGECOUNT value on done: 5243.0 [2024-08-05 06:03:45,898][00139] Sum rewards: -7.066, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.135', 'weapon5': '0.002', 'AMMO5': '0.005', 'AMMO2': '0.007', 'weapon4': '0.016', 'WEAPON1': '0.020', 'AMMO4': '0.037', 'ARMOR': '0.044', 'HITCOUNT': '0.090', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.167', 'DAMAGECOUNT': '0.330', 'WEAPON3': '0.800', 'weapon2': '0.932', 'FRAGCOUNT': '1.000', 'weapon3': '1.168'} [2024-08-05 06:03:46,121][00139] DAMAGECOUNT value on done: 5018.0 [2024-08-05 06:03:46,122][00139] Sum rewards: -3.984, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.507', 'AMMO5': '0.004', 'AMMO2': '0.014', 'WEAPON1': '0.020', 'ARMOR': '0.042', 'weapon5': '0.050', 'AMMO4': '0.072', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.105', 'weapon4': '0.112', 'HITCOUNT': '0.130', 'DAMAGECOUNT': '0.555', 'WEAPON3': '0.600', 'weapon2': '0.906', 'weapon3': '0.962', 'FRAGCOUNT': '2.000'} [2024-08-05 06:03:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1376256. Throughput: 0: 283.3. Samples: 345243. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:50,484][00034] Avg episode reward: [(0, '-6.454')] [2024-08-05 06:03:50,486][00132] Saving new best policy, reward=-6.454! [2024-08-05 06:03:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1384448. Throughput: 0: 281.8. Samples: 346916. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:03:55,485][00034] Avg episode reward: [(0, '-6.454')] [2024-08-05 06:03:59,493][00138] Updated weights for policy 0, policy_version 170 (0.0017) [2024-08-05 06:04:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1392640. Throughput: 0: 282.1. Samples: 347779. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:00,484][00034] Avg episode reward: [(0, '-6.454')] [2024-08-05 06:04:00,746][00139] DAMAGECOUNT value on done: 5273.0 [2024-08-05 06:04:00,961][00139] DAMAGECOUNT value on done: 5053.0 [2024-08-05 06:04:00,962][00139] Sum rewards: -8.680, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.454', 'AMMO2': '0.006', 'weapon4': '0.016', 'AMMO4': '0.030', 'ARMOR': '0.038', 'HITCOUNT': '0.040', 'WEAPON4': '0.100', 'DAMAGECOUNT': '0.105', 'AMMO3': '0.142', 'WEAPON3': '0.800', 'weapon3': '0.902', 'FRAGCOUNT': '1.000', 'weapon2': '1.094'} [2024-08-05 06:04:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1392640. Throughput: 0: 282.3. Samples: 349519. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:05,485][00034] Avg episode reward: [(0, '-6.454')] [2024-08-05 06:04:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1400832. Throughput: 0: 283.7. Samples: 351267. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:10,484][00034] Avg episode reward: [(0, '-6.454')] [2024-08-05 06:04:15,465][00139] DAMAGECOUNT value on done: 5368.0 [2024-08-05 06:04:15,466][00139] Sum rewards: -9.209, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.423', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.009', 'AMMO2': '0.030', 'weapon4': '0.036', 'weapon5': '0.072', 'HITCOUNT': '0.080', 'AMMO3': '0.117', 'AMMO4': '0.150', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.285', 'WEAPON4': '0.350', 'ARMOR': '0.508', 'WEAPON3': '0.650', 'weapon2': '0.958', 'weapon3': '1.020'} [2024-08-05 06:04:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1409024. Throughput: 0: 283.5. Samples: 352121. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:15,484][00034] Avg episode reward: [(0, '-6.428')] [2024-08-05 06:04:15,492][00132] Saving new best policy, reward=-6.428! 
[2024-08-05 06:04:15,716][00139] DAMAGECOUNT value on done: 5088.0 [2024-08-05 06:04:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1409024. Throughput: 0: 282.2. Samples: 353755. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:20,485][00034] Avg episode reward: [(0, '-6.482')] [2024-08-05 06:04:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 1417216. Throughput: 0: 281.7. Samples: 355452. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:25,485][00034] Avg episode reward: [(0, '-6.482')] [2024-08-05 06:04:30,298][00139] DAMAGECOUNT value on done: 5388.0 [2024-08-05 06:04:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 1425408. Throughput: 0: 283.4. Samples: 356330. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:30,484][00034] Avg episode reward: [(0, '-6.487')] [2024-08-05 06:04:30,520][00139] DAMAGECOUNT value on done: 5187.0 [2024-08-05 06:04:30,521][00139] Sum rewards: -7.260, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.506', 'AMMO5': '0.014', 'AMMO2': '0.017', 'weapon5': '0.040', 'weapon4': '0.048', 'ARMOR': '0.056', 'HITCOUNT': '0.060', 'AMMO4': '0.082', 'AMMO3': '0.115', 'WEAPON5': '0.200', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.297', 'WEAPON3': '0.600', 'weapon3': '0.670', 'FRAGCOUNT': '1.000', 'weapon2': '1.346'} [2024-08-05 06:04:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1433600. Throughput: 0: 284.9. Samples: 358064. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:35,484][00034] Avg episode reward: [(0, '-6.501')] [2024-08-05 06:04:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1433600. Throughput: 0: 286.4. Samples: 359805. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:40,484][00034] Avg episode reward: [(0, '-6.501')] [2024-08-05 06:04:44,694][00139] DAMAGECOUNT value on done: 5493.0 [2024-08-05 06:04:44,924][00139] DAMAGECOUNT value on done: 5252.0 [2024-08-05 06:04:44,924][00139] Sum rewards: -3.713, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.688', 'WEAPON1': '0.020', 'AMMO2': '0.026', 'ARMOR': '0.040', 'HITCOUNT': '0.050', 'AMMO4': '0.129', 'AMMO3': '0.158', 'DAMAGECOUNT': '0.195', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.012', 'weapon3': '1.196'} [2024-08-05 06:04:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1441792. Throughput: 0: 286.9. Samples: 360688. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:45,485][00034] Avg episode reward: [(0, '-6.607')] [2024-08-05 06:04:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1449984. Throughput: 0: 285.4. Samples: 362362. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:50,485][00034] Avg episode reward: [(0, '-6.607')] [2024-08-05 06:04:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1449984. Throughput: 0: 284.1. Samples: 364053. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:04:55,484][00034] Avg episode reward: [(0, '-6.607')] [2024-08-05 06:04:59,610][00139] DAMAGECOUNT value on done: 5508.0 [2024-08-05 06:04:59,840][00139] DAMAGECOUNT value on done: 5262.0 [2024-08-05 06:05:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1458176. Throughput: 0: 284.2. Samples: 364911. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:00,484][00034] Avg episode reward: [(0, '-6.643')] [2024-08-05 06:05:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1466368. Throughput: 0: 286.8. Samples: 366662. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:05,484][00034] Avg episode reward: [(0, '-6.643')] [2024-08-05 06:05:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1466368. Throughput: 0: 288.1. Samples: 368415. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:10,484][00034] Avg episode reward: [(0, '-6.643')] [2024-08-05 06:05:10,953][00138] Updated weights for policy 0, policy_version 180 (0.0016) [2024-08-05 06:05:13,984][00139] DAMAGECOUNT value on done: 5573.0 [2024-08-05 06:05:14,192][00139] DAMAGECOUNT value on done: 5507.0 [2024-08-05 06:05:14,192][00139] Sum rewards: -5.271, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.105', 'AMMO5': '0.003', 'AMMO2': '0.028', 'weapon4': '0.032', 'WEAPON5': '0.050', 'ARMOR': '0.072', 'AMMO4': '0.141', 'AMMO3': '0.151', 'HITCOUNT': '0.170', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.014', 'weapon3': '1.138'} [2024-08-05 06:05:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1474560. Throughput: 0: 288.0. Samples: 369292. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:15,484][00034] Avg episode reward: [(0, '-6.627')] [2024-08-05 06:05:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1482752. Throughput: 0: 287.3. Samples: 370992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:20,484][00034] Avg episode reward: [(0, '-6.627')] [2024-08-05 06:05:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1490944. Throughput: 0: 287.6. Samples: 372749. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:25,485][00034] Avg episode reward: [(0, '-6.627')] [2024-08-05 06:05:28,593][00139] DAMAGECOUNT value on done: 5663.0 [2024-08-05 06:05:28,594][00139] Sum rewards: -8.154, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.758', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.007', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'weapon5': '0.010', 'weapon4': '0.024', 'AMMO4': '0.034', 'HITCOUNT': '0.070', 'WEAPON4': '0.100', 'AMMO3': '0.183', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.270', 'WEAPON3': '0.750', 'weapon3': '0.898', 'weapon2': '1.288'} [2024-08-05 06:05:28,857][00139] DAMAGECOUNT value on done: 5529.0 [2024-08-05 06:05:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1490944. Throughput: 0: 287.0. Samples: 373603. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:30,485][00034] Avg episode reward: [(0, '-6.680')] [2024-08-05 06:05:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1499136. Throughput: 0: 288.2. Samples: 375331. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:35,485][00034] Avg episode reward: [(0, '-6.680')] [2024-08-05 06:05:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000183_1499136.pth... [2024-08-05 06:05:35,564][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000149_1220608.pth [2024-08-05 06:05:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1507328. Throughput: 0: 288.0. Samples: 377015. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:40,484][00034] Avg episode reward: [(0, '-6.680')] [2024-08-05 06:05:43,283][00139] DAMAGECOUNT value on done: 5678.0 [2024-08-05 06:05:43,513][00139] DAMAGECOUNT value on done: 5589.0 [2024-08-05 06:05:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1507328. 
Throughput: 0: 288.4. Samples: 377891. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:45,484][00034] Avg episode reward: [(0, '-6.688')] [2024-08-05 06:05:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1515520. Throughput: 0: 287.3. Samples: 379589. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:50,484][00034] Avg episode reward: [(0, '-6.688')] [2024-08-05 06:05:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1523712. Throughput: 0: 286.4. Samples: 381305. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:05:55,485][00034] Avg episode reward: [(0, '-6.688')] [2024-08-05 06:05:57,862][00139] DAMAGECOUNT value on done: 5705.0 [2024-08-05 06:05:57,862][00139] Sum rewards: -5.599, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.628', 'AMMO2': '0.004', 'AMMO5': '0.007', 'AMMO4': '0.021', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'HITCOUNT': '0.050', 'DAMAGECOUNT': '0.081', 'AMMO3': '0.100', 'weapon4': '0.112', 'WEAPON3': '0.450', 'ARMOR': '0.505', 'FRAGCOUNT': '1.000', 'weapon3': '1.048', 'weapon2': '1.300'} [2024-08-05 06:05:58,108][00139] DAMAGECOUNT value on done: 5629.0 [2024-08-05 06:06:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1523712. Throughput: 0: 286.9. Samples: 382203. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:00,484][00034] Avg episode reward: [(0, '-6.651')] [2024-08-05 06:06:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1531904. Throughput: 0: 288.7. Samples: 383985. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:05,484][00034] Avg episode reward: [(0, '-6.651')] [2024-08-05 06:06:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1540096. Throughput: 0: 288.5. Samples: 385730. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:10,484][00034] Avg episode reward: [(0, '-6.651')] [2024-08-05 06:06:12,261][00139] DAMAGECOUNT value on done: 5755.0 [2024-08-05 06:06:12,497][00139] DAMAGECOUNT value on done: 5714.0 [2024-08-05 06:06:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1548288. Throughput: 0: 288.4. Samples: 386583. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:15,484][00034] Avg episode reward: [(0, '-6.652')] [2024-08-05 06:06:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1548288. Throughput: 0: 289.1. Samples: 388339. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:20,487][00034] Avg episode reward: [(0, '-6.652')] [2024-08-05 06:06:22,119][00138] Updated weights for policy 0, policy_version 190 (0.0017) [2024-08-05 06:06:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1556480. Throughput: 0: 289.6. Samples: 390046. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:25,484][00034] Avg episode reward: [(0, '-6.652')] [2024-08-05 06:06:26,809][00139] DAMAGECOUNT value on done: 5840.0 [2024-08-05 06:06:26,810][00139] Sum rewards: -3.700, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO2': '0.013', 'HEALTH': '0.032', 'ARMOR': '0.036', 'weapon4': '0.042', 'AMMO4': '0.062', 'HITCOUNT': '0.070', 'WEAPON4': '0.100', 'AMMO3': '0.122', 'DAMAGECOUNT': '0.255', 'WEAPON3': '0.500', 'FRAGCOUNT': '1.000', 'weapon2': '1.114', 'weapon3': '1.204'} [2024-08-05 06:06:27,025][00139] DAMAGECOUNT value on done: 6007.0 [2024-08-05 06:06:27,026][00139] Sum rewards: -3.499, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.266', 'AMMO2': '0.009', 'AMMO4': '0.046', 'ARMOR': '0.052', 'WEAPON4': '0.100', 'AMMO3': '0.112', 'HITCOUNT': '0.250', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.879', 'weapon3': '1.102', 'weapon2': '1.316', 'FRAGCOUNT': '2.000'} [2024-08-05 06:06:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1564672. Throughput: 0: 289.5. Samples: 390917. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:30,484][00034] Avg episode reward: [(0, '-6.532')] [2024-08-05 06:06:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1564672. Throughput: 0: 290.1. Samples: 392644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:35,484][00034] Avg episode reward: [(0, '-6.532')] [2024-08-05 06:06:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1572864. Throughput: 0: 289.6. Samples: 394337. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:40,485][00034] Avg episode reward: [(0, '-6.532')] [2024-08-05 06:06:41,525][00139] DAMAGECOUNT value on done: 5860.0 [2024-08-05 06:06:41,734][00139] DAMAGECOUNT value on done: 6232.0 [2024-08-05 06:06:41,735][00139] Sum rewards: -3.993, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.240', 'AMMO5': '0.003', 'AMMO2': '0.020', 'ARMOR': '0.036', 'weapon4': '0.046', 'WEAPON5': '0.050', 'AMMO4': '0.097', 'AMMO3': '0.146', 'WEAPON4': '0.150', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.675', 'WEAPON3': '0.800', 'weapon3': '0.994', 'FRAGCOUNT': '1.000', 'weapon2': '1.260'} [2024-08-05 06:06:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1581056. Throughput: 0: 289.1. Samples: 395212. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:45,485][00034] Avg episode reward: [(0, '-6.476')] [2024-08-05 06:06:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1581056. Throughput: 0: 287.4. Samples: 396920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:50,484][00034] Avg episode reward: [(0, '-6.476')] [2024-08-05 06:06:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1589248. Throughput: 0: 285.6. Samples: 398581. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:06:55,484][00034] Avg episode reward: [(0, '-6.476')] [2024-08-05 06:06:56,428][00139] DAMAGECOUNT value on done: 5945.0 [2024-08-05 06:06:56,429][00139] Sum rewards: -5.872, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.864', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.022', 'AMMO2': '-0.004', 'AMMO5': '0.004', 'ARMOR': '0.008', 'HITCOUNT': '0.050', 'weapon5': '0.056', 'AMMO3': '0.087', 'WEAPON5': '0.100', 'DAMAGECOUNT': '0.255', 'WEAPON3': '0.600', 'weapon2': '0.686', 'weapon3': '1.422'} [2024-08-05 06:06:56,674][00139] DAMAGECOUNT value on done: 6413.0 [2024-08-05 06:06:56,675][00139] Sum rewards: -4.677, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.242', 'weapon4': '0.010', 'AMMO2': '0.014', 'ARMOR': '0.029', 'WEAPON4': '0.050', 'AMMO4': '0.068', 'AMMO3': '0.135', 'HITCOUNT': '0.150', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.543', 'weapon3': '0.762', 'FRAGCOUNT': '1.000', 'weapon2': '1.554'} [2024-08-05 06:07:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1597440. Throughput: 0: 284.8. Samples: 399399. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:00,485][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:07:00,487][00132] Saving new best policy, reward=-6.422! [2024-08-05 06:07:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1605632. Throughput: 0: 284.1. Samples: 401125. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:05,484][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:07:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1605632. Throughput: 0: 284.1. Samples: 402830. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:10,484][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:07:11,166][00139] DAMAGECOUNT value on done: 5995.0 [2024-08-05 06:07:11,398][00139] DAMAGECOUNT value on done: 6518.0 [2024-08-05 06:07:11,398][00139] Sum rewards: -3.779, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.550', 'AMMO2': '0.021', 'ARMOR': '0.034', 'HITCOUNT': '0.070', 'AMMO3': '0.085', 'AMMO4': '0.104', 'weapon4': '0.222', 'WEAPON4': '0.250', 'WEAPON3': '0.300', 'DAMAGECOUNT': '0.315', 'weapon3': '0.498', 'FRAGCOUNT': '1.000', 'weapon2': '1.372'} [2024-08-05 06:07:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1613824. Throughput: 0: 283.8. Samples: 403690. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:15,485][00034] Avg episode reward: [(0, '-6.452')] [2024-08-05 06:07:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1622016. Throughput: 0: 284.0. Samples: 405426. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:20,484][00034] Avg episode reward: [(0, '-6.452')] [2024-08-05 06:07:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1622016. Throughput: 0: 284.4. Samples: 407137. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:25,485][00034] Avg episode reward: [(0, '-6.452')] [2024-08-05 06:07:25,844][00139] DAMAGECOUNT value on done: 6110.0 [2024-08-05 06:07:26,070][00139] DAMAGECOUNT value on done: 6558.0 [2024-08-05 06:07:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1630208. Throughput: 0: 284.1. Samples: 407998. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:30,484][00034] Avg episode reward: [(0, '-6.483')] [2024-08-05 06:07:33,850][00138] Updated weights for policy 0, policy_version 200 (0.0017) [2024-08-05 06:07:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1638400. Throughput: 0: 283.8. Samples: 409693. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:35,486][00034] Avg episode reward: [(0, '-6.483')] [2024-08-05 06:07:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000200_1638400.pth... [2024-08-05 06:07:35,578][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000166_1359872.pth [2024-08-05 06:07:40,398][00139] DAMAGECOUNT value on done: 6120.0 [2024-08-05 06:07:40,398][00139] Sum rewards: -4.397, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.010', 'AMMO2': '0.001', 'AMMO5': '0.003', 'AMMO4': '0.003', 'HITCOUNT': '0.010', 'DAMAGECOUNT': '0.030', 'ARMOR': '0.040', 'AMMO3': '0.102', 'WEAPON3': '0.450', 'weapon3': '0.824', 'FRAGCOUNT': '1.000', 'weapon2': '1.400'} [2024-08-05 06:07:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1638400. Throughput: 0: 286.2. Samples: 411458. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:40,484][00034] Avg episode reward: [(0, '-6.442')] [2024-08-05 06:07:40,618][00139] DAMAGECOUNT value on done: 6558.0 [2024-08-05 06:07:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1646592. Throughput: 0: 287.1. Samples: 412318. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:45,484][00034] Avg episode reward: [(0, '-6.478')] [2024-08-05 06:07:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1654784. Throughput: 0: 286.8. Samples: 414030. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:50,485][00034] Avg episode reward: [(0, '-6.478')] [2024-08-05 06:07:54,907][00139] DAMAGECOUNT value on done: 6129.0 [2024-08-05 06:07:55,225][00139] DAMAGECOUNT value on done: 6613.0 [2024-08-05 06:07:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1662976. Throughput: 0: 287.8. Samples: 415779. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:07:55,485][00034] Avg episode reward: [(0, '-6.457')] [2024-08-05 06:08:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1662976. Throughput: 0: 287.1. Samples: 416611. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:00,484][00034] Avg episode reward: [(0, '-6.457')] [2024-08-05 06:08:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1671168. Throughput: 0: 286.1. Samples: 418300. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:05,484][00034] Avg episode reward: [(0, '-6.457')] [2024-08-05 06:08:09,818][00139] DAMAGECOUNT value on done: 6144.0 [2024-08-05 06:08:09,818][00139] Sum rewards: -3.661, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.450', 'HITCOUNT': '0.010', 'AMMO2': '0.026', 'ARMOR': '0.040', 'DAMAGECOUNT': '0.045', 'AMMO3': '0.096', 'WEAPON4': '0.100', 'weapon4': '0.102', 'AMMO4': '0.128', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.046', 'weapon3': '1.346'} [2024-08-05 06:08:10,025][00139] DAMAGECOUNT value on done: 6783.0 [2024-08-05 06:08:10,026][00139] Sum rewards: -6.343, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.910', 'AMMO4': '-0.000', 'AMMO2': '0.000', 'AMMO5': '0.010', 'weapon5': '0.020', 'WEAPON4': '0.050', 'weapon4': '0.064', 'HITCOUNT': '0.100', 'AMMO3': '0.169', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.044', 'weapon3': '1.300'} [2024-08-05 06:08:10,483][00034] Fps is (10 sec: 
1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1679360. Throughput: 0: 286.4. Samples: 420027. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:10,485][00034] Avg episode reward: [(0, '-6.467')] [2024-08-05 06:08:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1679360. Throughput: 0: 286.1. Samples: 420873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:15,485][00034] Avg episode reward: [(0, '-6.467')] [2024-08-05 06:08:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1687552. Throughput: 0: 286.6. Samples: 422590. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:20,487][00034] Avg episode reward: [(0, '-6.467')] [2024-08-05 06:08:24,568][00139] DAMAGECOUNT value on done: 6329.0 [2024-08-05 06:08:24,569][00139] Sum rewards: -3.445, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.575', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.001', 'AMMO5': '0.005', 'AMMO4': '0.006', 'weapon5': '0.008', 'weapon4': '0.032', 'WEAPON1': '0.040', 'AMMO3': '0.091', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'ARMOR': '0.100', 'HITCOUNT': '0.110', 'WEAPON3': '0.450', 'DAMAGECOUNT': '0.555', 'weapon3': '1.038', 'weapon2': '1.244'} [2024-08-05 06:08:24,799][00139] DAMAGECOUNT value on done: 6912.0 [2024-08-05 06:08:24,799][00139] Sum rewards: -2.421, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.650', 'AMMO5': '0.005', 'AMMO2': '0.005', 'HITCOUNT': '0.020', 'AMMO4': '0.027', 'ARMOR': '0.039', 'weapon5': '0.042', 'WEAPON5': '0.050', 'AMMO3': '0.076', 'WEAPON4': '0.200', 'weapon4': '0.220', 'DAMAGECOUNT': '0.387', 'WEAPON3': '0.400', 'weapon3': '0.528', 'FRAGCOUNT': '1.000', 'weapon2': '1.480'} [2024-08-05 06:08:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1695744. Throughput: 0: 285.0. Samples: 424283. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:25,484][00034] Avg episode reward: [(0, '-6.412')] [2024-08-05 06:08:25,491][00132] Saving new best policy, reward=-6.412! [2024-08-05 06:08:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 1695744. Throughput: 0: 284.7. Samples: 425128. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:30,485][00034] Avg episode reward: [(0, '-6.412')] [2024-08-05 06:08:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1703936. Throughput: 0: 285.9. Samples: 426896. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:35,484][00034] Avg episode reward: [(0, '-6.412')] [2024-08-05 06:08:39,204][00139] DAMAGECOUNT value on done: 6389.0 [2024-08-05 06:08:39,432][00139] DAMAGECOUNT value on done: 7010.0 [2024-08-05 06:08:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1712128. Throughput: 0: 284.8. Samples: 428593. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:40,484][00034] Avg episode reward: [(0, '-6.447')] [2024-08-05 06:08:45,435][00138] Updated weights for policy 0, policy_version 210 (0.0017) [2024-08-05 06:08:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1720320. Throughput: 0: 285.2. Samples: 429446. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:45,484][00034] Avg episode reward: [(0, '-6.447')] [2024-08-05 06:08:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1720320. Throughput: 0: 286.3. Samples: 431183. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:50,484][00034] Avg episode reward: [(0, '-6.447')] [2024-08-05 06:08:53,925][00139] DAMAGECOUNT value on done: 6519.0 [2024-08-05 06:08:54,156][00139] DAMAGECOUNT value on done: 7104.0 [2024-08-05 06:08:54,157][00139] Sum rewards: -5.442, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.010', 'AMMO4': '-0.003', 'AMMO2': '-0.001', 'AMMO5': '0.015', 'ARMOR': '0.075', 'AMMO3': '0.098', 'HITCOUNT': '0.100', 'weapon5': '0.112', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.282', 'WEAPON3': '0.300', 'weapon3': '0.770', 'FRAGCOUNT': '1.000', 'weapon2': '1.670'} [2024-08-05 06:08:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1728512. Throughput: 0: 285.5. Samples: 432873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:08:55,484][00034] Avg episode reward: [(0, '-6.440')] [2024-08-05 06:09:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1736704. Throughput: 0: 286.0. Samples: 433741. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:00,484][00034] Avg episode reward: [(0, '-6.440')] [2024-08-05 06:09:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1736704. Throughput: 0: 284.9. Samples: 435410. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:05,485][00034] Avg episode reward: [(0, '-6.440')] [2024-08-05 06:09:08,716][00139] DAMAGECOUNT value on done: 6594.0 [2024-08-05 06:09:08,973][00139] DAMAGECOUNT value on done: 7154.0 [2024-08-05 06:09:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1744896. Throughput: 0: 285.8. Samples: 437145. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:10,484][00034] Avg episode reward: [(0, '-6.507')] [2024-08-05 06:09:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1753088. Throughput: 0: 287.1. 
Samples: 438048. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:15,485][00034] Avg episode reward: [(0, '-6.507')] [2024-08-05 06:09:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1753088. Throughput: 0: 285.9. Samples: 439761. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:20,484][00034] Avg episode reward: [(0, '-6.507')] [2024-08-05 06:09:23,178][00139] DAMAGECOUNT value on done: 6769.0 [2024-08-05 06:09:23,179][00139] Sum rewards: -10.526, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.921', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.002', 'AMMO5': '0.003', 'ARMOR': '0.008', 'weapon5': '0.008', 'AMMO4': '0.008', 'WEAPON1': '0.020', 'weapon4': '0.024', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'AMMO3': '0.141', 'HITCOUNT': '0.160', 'DAMAGECOUNT': '0.525', 'WEAPON3': '0.850', 'weapon3': '1.208', 'weapon2': '1.588'} [2024-08-05 06:09:23,403][00139] DAMAGECOUNT value on done: 7179.0 [2024-08-05 06:09:23,403][00139] Sum rewards: -10.583, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.870', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.012', 'weapon5': '0.022', 'AMMO2': '0.024', 'HITCOUNT': '0.030', 'DAMAGECOUNT': '0.075', 'weapon4': '0.094', 'AMMO4': '0.122', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'AMMO3': '0.199', 'weapon3': '0.708', 'WEAPON3': '0.800', 'weapon2': '1.650'} [2024-08-05 06:09:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1761280. Throughput: 0: 286.5. Samples: 441484. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:25,485][00034] Avg episode reward: [(0, '-6.530')] [2024-08-05 06:09:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1769472. Throughput: 0: 286.6. Samples: 442343. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:30,486][00034] Avg episode reward: [(0, '-6.530')] [2024-08-05 06:09:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1769472. Throughput: 0: 285.6. Samples: 444036. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:35,484][00034] Avg episode reward: [(0, '-6.530')] [2024-08-05 06:09:35,529][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000217_1777664.pth... [2024-08-05 06:09:35,603][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000183_1499136.pth [2024-08-05 06:09:37,901][00139] DAMAGECOUNT value on done: 6823.0 [2024-08-05 06:09:38,133][00139] DAMAGECOUNT value on done: 7264.0 [2024-08-05 06:09:38,133][00139] Sum rewards: -5.522, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.830', 'AMMO4': '-0.031', 'AMMO2': '-0.006', 'weapon5': '0.002', 'AMMO5': '0.010', 'ARMOR': '0.024', 'HITCOUNT': '0.090', 'WEAPON5': '0.100', 'AMMO3': '0.102', 'DAMAGECOUNT': '0.255', 'WEAPON3': '0.500', 'weapon3': '0.980', 'FRAGCOUNT': '1.000', 'weapon2': '1.532'} [2024-08-05 06:09:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1777664. Throughput: 0: 286.5. Samples: 445765. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:40,485][00034] Avg episode reward: [(0, '-6.525')] [2024-08-05 06:09:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1785856. Throughput: 0: 287.0. Samples: 446654. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:45,485][00034] Avg episode reward: [(0, '-6.525')] [2024-08-05 06:09:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1794048. Throughput: 0: 288.3. Samples: 448385. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:50,484][00034] Avg episode reward: [(0, '-6.525')] [2024-08-05 06:09:52,459][00139] DAMAGECOUNT value on done: 7025.0 [2024-08-05 06:09:52,460][00139] Sum rewards: -1.461, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.090', 'AMMO2': '0.009', 'AMMO4': '0.043', 'weapon4': '0.066', 'ARMOR': '0.072', 'AMMO3': '0.115', 'WEAPON4': '0.150', 'HITCOUNT': '0.180', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.606', 'weapon3': '1.280', 'weapon2': '1.308', 'FRAGCOUNT': '2.000'} [2024-08-05 06:09:52,684][00139] DAMAGECOUNT value on done: 7364.0 [2024-08-05 06:09:52,685][00139] Sum rewards: -3.793, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.316', 'AMMO5': '0.003', 'AMMO2': '0.023', 'weapon5': '0.046', 'WEAPON5': '0.050', 'HITCOUNT': '0.090', 'AMMO3': '0.100', 'weapon4': '0.114', 'AMMO4': '0.117', 'DAMAGECOUNT': '0.300', 'WEAPON4': '0.300', 'ARMOR': '0.424', 'WEAPON3': '0.550', 'weapon3': '0.894', 'FRAGCOUNT': '1.000', 'weapon2': '1.762'} [2024-08-05 06:09:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1794048. Throughput: 0: 288.2. Samples: 450112. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:09:55,484][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:09:55,491][00132] Saving new best policy, reward=-6.374! [2024-08-05 06:09:56,754][00138] Updated weights for policy 0, policy_version 220 (0.0018) [2024-08-05 06:10:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1802240. Throughput: 0: 287.5. Samples: 450986. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:00,485][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:10:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1810432. Throughput: 0: 286.6. Samples: 452659. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:05,484][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:10:07,030][00139] DAMAGECOUNT value on done: 7118.0 [2024-08-05 06:10:07,274][00139] DAMAGECOUNT value on done: 7494.0 [2024-08-05 06:10:07,274][00139] Sum rewards: -4.168, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.410', 'AMMO2': '0.009', 'AMMO4': '0.046', 'ARMOR': '0.048', 'weapon4': '0.050', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.132', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.320', 'weapon2': '1.356'} [2024-08-05 06:10:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1810432. Throughput: 0: 288.1. Samples: 454447. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:10,484][00034] Avg episode reward: [(0, '-6.378')] [2024-08-05 06:10:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1818624. Throughput: 0: 288.6. Samples: 455329. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:15,485][00034] Avg episode reward: [(0, '-6.378')] [2024-08-05 06:10:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1826816. Throughput: 0: 289.9. Samples: 457083. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:20,485][00034] Avg episode reward: [(0, '-6.378')] [2024-08-05 06:10:21,359][00139] DAMAGECOUNT value on done: 7153.0 [2024-08-05 06:10:21,598][00139] DAMAGECOUNT value on done: 7499.0 [2024-08-05 06:10:21,599][00139] Sum rewards: -5.808, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.010', 'HITCOUNT': '0.010', 'WEAPON1': '0.010', 'DAMAGECOUNT': '0.015', 'AMMO2': '0.018', 'weapon4': '0.066', 'AMMO3': '0.088', 'AMMO4': '0.091', 'WEAPON4': '0.100', 'weapon5': '0.144', 'WEAPON5': '0.200', 'HEALTH': '0.458', 'WEAPON3': '0.500', 'weapon3': '0.778', 'weapon2': '1.454'} [2024-08-05 06:10:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1835008. Throughput: 0: 290.4. Samples: 458831. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:25,484][00034] Avg episode reward: [(0, '-6.365')] [2024-08-05 06:10:25,491][00132] Saving new best policy, reward=-6.365! [2024-08-05 06:10:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1835008. Throughput: 0: 290.8. Samples: 459739. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:30,485][00034] Avg episode reward: [(0, '-6.365')] [2024-08-05 06:10:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1843200. Throughput: 0: 289.8. Samples: 461424. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:35,485][00034] Avg episode reward: [(0, '-6.365')] [2024-08-05 06:10:35,807][00139] DAMAGECOUNT value on done: 7368.0 [2024-08-05 06:10:35,808][00139] Sum rewards: -5.301, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.046', 'AMMO2': '0.004', 'AMMO4': '0.022', 'ARMOR': '0.072', 'AMMO3': '0.144', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.162', 'weapon3': '1.436'} [2024-08-05 06:10:36,038][00139] DAMAGECOUNT value on done: 7625.0 [2024-08-05 06:10:36,039][00139] Sum rewards: -8.493, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-3.312', 'AMMO2': '0.021', 'HITCOUNT': '0.080', 'AMMO4': '0.103', 'weapon4': '0.120', 'AMMO3': '0.168', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.378', 'ARMOR': '0.547', 'weapon3': '0.776', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.826'} [2024-08-05 06:10:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1851392. Throughput: 0: 290.3. Samples: 463177. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:40,484][00034] Avg episode reward: [(0, '-6.329')] [2024-08-05 06:10:40,486][00132] Saving new best policy, reward=-6.329! [2024-08-05 06:10:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1851392. Throughput: 0: 289.7. Samples: 464022. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:45,485][00034] Avg episode reward: [(0, '-6.329')] [2024-08-05 06:10:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1859584. Throughput: 0: 290.0. Samples: 465710. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:50,485][00034] Avg episode reward: [(0, '-6.329')] [2024-08-05 06:10:50,559][00139] DAMAGECOUNT value on done: 7498.0 [2024-08-05 06:10:50,559][00139] Sum rewards: -4.257, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.490', 'AMMO5': '0.004', 'ARMOR': '0.021', 'AMMO2': '0.029', 'weapon5': '0.042', 'weapon4': '0.046', 'HITCOUNT': '0.070', 'WEAPON5': '0.100', 'AMMO3': '0.105', 'AMMO4': '0.145', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.500', 'weapon3': '0.998', 'weapon2': '1.332', 'FRAGCOUNT': '2.000'} [2024-08-05 06:10:50,785][00139] DAMAGECOUNT value on done: 7649.0 [2024-08-05 06:10:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1867776. Throughput: 0: 288.7. Samples: 467440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:10:55,487][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:11:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1875968. Throughput: 0: 289.2. Samples: 468342. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:00,484][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:11:05,100][00139] DAMAGECOUNT value on done: 7772.0 [2024-08-05 06:11:05,100][00139] Sum rewards: -3.748, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.285', 'AMMO2': '0.000', 'AMMO4': '0.001', 'AMMO3': '0.123', 'HITCOUNT': '0.170', 'ARMOR': '0.498', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.822', 'FRAGCOUNT': '1.000', 'weapon3': '1.054', 'weapon2': '1.568'} [2024-08-05 06:11:05,350][00139] DAMAGECOUNT value on done: 7649.0 [2024-08-05 06:11:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1875968. Throughput: 0: 288.9. Samples: 470082. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:05,486][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:11:07,719][00138] Updated weights for policy 0, policy_version 230 (0.0017) [2024-08-05 06:11:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 1884160. Throughput: 0: 286.9. Samples: 471742. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:10,484][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:11:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1892352. Throughput: 0: 286.3. Samples: 472623. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:15,485][00034] Avg episode reward: [(0, '-6.422')] [2024-08-05 06:11:19,735][00139] DAMAGECOUNT value on done: 7787.0 [2024-08-05 06:11:19,957][00139] DAMAGECOUNT value on done: 7674.0 [2024-08-05 06:11:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1892352. Throughput: 0: 287.5. Samples: 474362. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:20,484][00034] Avg episode reward: [(0, '-6.439')] [2024-08-05 06:11:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1900544. Throughput: 0: 286.6. Samples: 476072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:25,484][00034] Avg episode reward: [(0, '-6.439')] [2024-08-05 06:11:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1908736. Throughput: 0: 287.6. Samples: 476963. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:30,484][00034] Avg episode reward: [(0, '-6.439')] [2024-08-05 06:11:34,141][00139] DAMAGECOUNT value on done: 7927.0 [2024-08-05 06:11:34,141][00139] Sum rewards: -3.922, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.134', 'AMMO2': '0.006', 'weapon5': '0.016', 'AMMO5': '0.020', 'AMMO4': '0.029', 'AMMO3': '0.088', 'HITCOUNT': '0.110', 'weapon4': '0.136', 'WEAPON4': '0.150', 'WEAPON5': '0.300', 'WEAPON3': '0.400', 'DAMAGECOUNT': '0.420', 'ARMOR': '0.949', 'FRAGCOUNT': '1.000', 'weapon3': '1.006', 'weapon2': '1.582'} [2024-08-05 06:11:34,362][00139] DAMAGECOUNT value on done: 7894.0 [2024-08-05 06:11:34,363][00139] Sum rewards: -7.643, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.820', 'AMMO5': '0.005', 'AMMO2': '0.021', 'ARMOR': '0.033', 'weapon4': '0.066', 'WEAPON5': '0.100', 'AMMO4': '0.104', 'WEAPON4': '0.150', 'AMMO3': '0.186', 'HITCOUNT': '0.190', 'DAMAGECOUNT': '0.660', 'WEAPON3': '0.900', 'weapon2': '1.246', 'weapon3': '1.266', 'FRAGCOUNT': '2.000'} [2024-08-05 06:11:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1908736. Throughput: 0: 289.4. Samples: 478733. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:35,484][00034] Avg episode reward: [(0, '-6.410')] [2024-08-05 06:11:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000233_1908736.pth... [2024-08-05 06:11:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000200_1638400.pth [2024-08-05 06:11:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1916928. Throughput: 0: 287.5. Samples: 480379. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:40,485][00034] Avg episode reward: [(0, '-6.410')] [2024-08-05 06:11:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1925120. 
Throughput: 0: 287.4. Samples: 481275. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:45,484][00034] Avg episode reward: [(0, '-6.410')] [2024-08-05 06:11:48,964][00139] DAMAGECOUNT value on done: 8077.0 [2024-08-05 06:11:48,965][00139] Sum rewards: -8.829, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.138', 'AMMO2': '0.010', 'AMMO4': '0.048', 'weapon4': '0.076', 'WEAPON4': '0.100', 'HITCOUNT': '0.140', 'AMMO3': '0.189', 'DAMAGECOUNT': '0.450', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon2': '1.360', 'weapon3': '1.386'} [2024-08-05 06:11:49,211][00139] DAMAGECOUNT value on done: 7985.0 [2024-08-05 06:11:49,211][00139] Sum rewards: -2.746, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.860', 'AMMO5': '0.003', 'ARMOR': '0.004', 'AMMO2': '0.010', 'weapon5': '0.014', 'AMMO4': '0.047', 'WEAPON5': '0.050', 'HITCOUNT': '0.090', 'weapon4': '0.112', 'AMMO3': '0.123', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.273', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.120', 'weapon3': '1.218'} [2024-08-05 06:11:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1933312. Throughput: 0: 286.6. Samples: 482977. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:50,485][00034] Avg episode reward: [(0, '-6.340')] [2024-08-05 06:11:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1933312. Throughput: 0: 287.8. Samples: 484691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:11:55,484][00034] Avg episode reward: [(0, '-6.340')] [2024-08-05 06:12:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1941504. Throughput: 0: 286.6. Samples: 485521. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:00,485][00034] Avg episode reward: [(0, '-6.340')] [2024-08-05 06:12:03,638][00139] DAMAGECOUNT value on done: 8194.0 [2024-08-05 06:12:03,638][00139] Sum rewards: -7.492, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.046', 'weapon5': '0.004', 'AMMO2': '0.012', 'AMMO5': '0.015', 'AMMO4': '0.062', 'HITCOUNT': '0.090', 'weapon4': '0.128', 'WEAPON4': '0.150', 'AMMO3': '0.178', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.351', 'WEAPON3': '0.850', 'weapon2': '0.882', 'FRAGCOUNT': '1.000', 'weapon3': '1.632'} [2024-08-05 06:12:03,849][00139] DAMAGECOUNT value on done: 8025.0 [2024-08-05 06:12:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1949696. Throughput: 0: 286.9. Samples: 487273. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:05,484][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:12:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1949696. Throughput: 0: 285.6. Samples: 488924. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:10,484][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:12:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1957888. Throughput: 0: 284.7. Samples: 489776. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:15,485][00034] Avg episode reward: [(0, '-6.374')] [2024-08-05 06:12:18,427][00139] DAMAGECOUNT value on done: 8244.0 [2024-08-05 06:12:18,652][00139] DAMAGECOUNT value on done: 8220.0 [2024-08-05 06:12:18,653][00139] Sum rewards: -2.512, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-2.009', 'AMMO4': '-0.039', 'AMMO2': '-0.008', 'AMMO5': '0.005', 'AMMO3': '0.117', 'HITCOUNT': '0.140', 'ARMOR': '0.508', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.000', 'weapon3': '1.488'} [2024-08-05 06:12:19,179][00138] Updated weights for policy 0, policy_version 240 (0.0017) [2024-08-05 06:12:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1966080. Throughput: 0: 284.3. Samples: 491528. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:20,484][00034] Avg episode reward: [(0, '-6.359')] [2024-08-05 06:12:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1966080. Throughput: 0: 287.4. Samples: 493310. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:25,487][00034] Avg episode reward: [(0, '-6.359')] [2024-08-05 06:12:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1974272. Throughput: 0: 286.7. Samples: 494176. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:30,484][00034] Avg episode reward: [(0, '-6.359')] [2024-08-05 06:12:32,791][00139] DAMAGECOUNT value on done: 8309.0 [2024-08-05 06:12:33,029][00139] DAMAGECOUNT value on done: 8385.0 [2024-08-05 06:12:33,029][00139] Sum rewards: -8.727, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.986', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'weapon5': '0.014', 'AMMO2': '0.018', 'WEAPON1': '0.030', 'ARMOR': '0.044', 'weapon4': '0.086', 'AMMO4': '0.090', 'HITCOUNT': '0.110', 'WEAPON5': '0.150', 'AMMO3': '0.156', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.800', 'weapon3': '0.988', 'weapon2': '1.820'} [2024-08-05 06:12:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1982464. Throughput: 0: 286.9. Samples: 495888. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:35,484][00034] Avg episode reward: [(0, '-6.314')] [2024-08-05 06:12:35,493][00132] Saving new best policy, reward=-6.314! [2024-08-05 06:12:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 1990656. Throughput: 0: 287.0. Samples: 497607. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:40,486][00034] Avg episode reward: [(0, '-6.314')] [2024-08-05 06:12:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1990656. Throughput: 0: 287.6. Samples: 498463. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:45,484][00034] Avg episode reward: [(0, '-6.314')] [2024-08-05 06:12:47,587][00139] DAMAGECOUNT value on done: 8384.0 [2024-08-05 06:12:47,587][00139] Sum rewards: -7.272, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.219', 'AMMO5': '0.013', 'AMMO2': '0.015', 'weapon5': '0.016', 'ARMOR': '0.036', 'weapon4': '0.042', 'HITCOUNT': '0.070', 'AMMO4': '0.076', 'AMMO3': '0.112', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.225', 'WEAPON5': '0.250', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.060', 'weapon2': '1.432'} [2024-08-05 06:12:47,823][00139] DAMAGECOUNT value on done: 8505.0 [2024-08-05 06:12:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 1998848. Throughput: 0: 286.5. Samples: 500164. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:50,484][00034] Avg episode reward: [(0, '-6.307')] [2024-08-05 06:12:50,486][00132] Saving new best policy, reward=-6.307! [2024-08-05 06:12:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2007040. Throughput: 0: 287.8. Samples: 501874. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:12:55,485][00034] Avg episode reward: [(0, '-6.307')] [2024-08-05 06:13:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2007040. Throughput: 0: 288.6. Samples: 502763. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:00,484][00034] Avg episode reward: [(0, '-6.307')] [2024-08-05 06:13:02,329][00139] DAMAGECOUNT value on done: 8584.0 [2024-08-05 06:13:02,330][00139] Sum rewards: -0.959, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.749', 'AMMO5': '0.007', 'AMMO2': '0.015', 'WEAPON1': '0.020', 'weapon5': '0.052', 'weapon4': '0.068', 'AMMO4': '0.076', 'ARMOR': '0.080', 'HITCOUNT': '0.080', 'AMMO3': '0.105', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.600', 'weapon3': '1.176', 'weapon2': '1.310', 'FRAGCOUNT': '2.000'} [2024-08-05 06:13:02,565][00139] DAMAGECOUNT value on done: 8610.0 [2024-08-05 06:13:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2015232. Throughput: 0: 287.2. Samples: 504452. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:05,485][00034] Avg episode reward: [(0, '-6.240')] [2024-08-05 06:13:05,492][00132] Saving new best policy, reward=-6.240! [2024-08-05 06:13:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2023424. Throughput: 0: 285.3. Samples: 506150. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:10,485][00034] Avg episode reward: [(0, '-6.240')] [2024-08-05 06:13:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2023424. Throughput: 0: 284.8. Samples: 506991. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:15,484][00034] Avg episode reward: [(0, '-6.240')] [2024-08-05 06:13:17,132][00139] DAMAGECOUNT value on done: 8636.0 [2024-08-05 06:13:17,359][00139] DAMAGECOUNT value on done: 8784.0 [2024-08-05 06:13:17,360][00139] Sum rewards: -8.426, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.302', 'AMMO5': '0.003', 'ARMOR': '0.004', 'WEAPON1': '0.010', 'AMMO2': '0.020', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'weapon4': '0.060', 'weapon5': '0.092', 'AMMO4': '0.100', 'HITCOUNT': '0.140', 'AMMO3': '0.159', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.522', 'WEAPON3': '0.850', 'weapon2': '1.084', 'weapon3': '1.232'} [2024-08-05 06:13:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2031616. Throughput: 0: 284.6. Samples: 508693. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:20,484][00034] Avg episode reward: [(0, '-6.281')] [2024-08-05 06:13:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2039808. Throughput: 0: 284.5. Samples: 510409. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:25,485][00034] Avg episode reward: [(0, '-6.281')] [2024-08-05 06:13:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2039808. Throughput: 0: 285.5. Samples: 511310. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:30,484][00034] Avg episode reward: [(0, '-6.281')] [2024-08-05 06:13:30,606][00138] Updated weights for policy 0, policy_version 250 (0.0017) [2024-08-05 06:13:31,727][00139] DAMAGECOUNT value on done: 8964.0 [2024-08-05 06:13:31,727][00139] Sum rewards: -3.538, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.665', 'AMMO5': '0.003', 'WEAPON1': '0.010', 'AMMO2': '0.015', 'weapon4': '0.028', 'ARMOR': '0.044', 'WEAPON5': '0.050', 'weapon5': '0.062', 'AMMO4': '0.076', 'AMMO3': '0.115', 'WEAPON4': '0.200', 'HITCOUNT': '0.250', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.984', 'weapon2': '1.230', 'weapon3': '1.360', 'FRAGCOUNT': '2.000'} [2024-08-05 06:13:31,960][00139] DAMAGECOUNT value on done: 9024.0 [2024-08-05 06:13:31,961][00139] Sum rewards: -7.275, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.175', 'AMMO5': '0.003', 'AMMO2': '0.010', 'weapon4': '0.014', 'weapon5': '0.024', 'ARMOR': '0.048', 'AMMO4': '0.049', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'HITCOUNT': '0.130', 'AMMO3': '0.202', 'DAMAGECOUNT': '0.720', 'WEAPON3': '1.000', 'weapon2': '1.040', 'weapon3': '1.510', 'FRAGCOUNT': '2.000'} [2024-08-05 06:13:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2048000. Throughput: 0: 285.7. Samples: 513022. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:35,485][00034] Avg episode reward: [(0, '-6.310')] [2024-08-05 06:13:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000250_2048000.pth... [2024-08-05 06:13:35,564][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000217_1777664.pth [2024-08-05 06:13:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2056192. Throughput: 0: 286.2. Samples: 514753. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:40,484][00034] Avg episode reward: [(0, '-6.310')] [2024-08-05 06:13:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2064384. Throughput: 0: 284.8. Samples: 515578. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:45,485][00034] Avg episode reward: [(0, '-6.310')] [2024-08-05 06:13:46,522][00139] DAMAGECOUNT value on done: 9078.0 [2024-08-05 06:13:46,764][00139] DAMAGECOUNT value on done: 9239.0 [2024-08-05 06:13:46,766][00139] Sum rewards: -3.620, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.954', 'AMMO5': '0.010', 'AMMO2': '0.014', 'WEAPON1': '0.020', 'weapon5': '0.024', 'ARMOR': '0.044', 'AMMO4': '0.069', 'AMMO3': '0.106', 'HITCOUNT': '0.140', 'WEAPON4': '0.150', 'weapon4': '0.152', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.000', 'weapon3': '1.560'} [2024-08-05 06:13:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2064384. Throughput: 0: 284.8. Samples: 517266. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:50,485][00034] Avg episode reward: [(0, '-6.258')] [2024-08-05 06:13:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2072576. Throughput: 0: 285.4. Samples: 518992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:13:55,484][00034] Avg episode reward: [(0, '-6.258')] [2024-08-05 06:14:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2080768. Throughput: 0: 286.7. Samples: 519894. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:00,484][00034] Avg episode reward: [(0, '-6.258')] [2024-08-05 06:14:01,089][00139] DAMAGECOUNT value on done: 9305.0 [2024-08-05 06:14:01,090][00139] Sum rewards: -8.735, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.392', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.005', 'weapon5': '0.014', 'AMMO2': '0.036', 'WEAPON5': '0.050', 'ARMOR': '0.068', 'AMMO3': '0.133', 'AMMO4': '0.178', 'HITCOUNT': '0.180', 'weapon4': '0.224', 'WEAPON4': '0.300', 'DAMAGECOUNT': '0.681', 'WEAPON3': '0.700', 'weapon3': '1.194', 'weapon2': '1.394'} [2024-08-05 06:14:01,320][00139] DAMAGECOUNT value on done: 9311.0 [2024-08-05 06:14:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2080768. Throughput: 0: 286.5. Samples: 521585. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:05,484][00034] Avg episode reward: [(0, '-6.220')] [2024-08-05 06:14:05,491][00132] Saving new best policy, reward=-6.220! [2024-08-05 06:14:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2088960. Throughput: 0: 286.3. Samples: 523291. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:10,484][00034] Avg episode reward: [(0, '-6.220')] [2024-08-05 06:14:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2097152. Throughput: 0: 285.9. Samples: 524176. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:15,484][00034] Avg episode reward: [(0, '-6.220')] [2024-08-05 06:14:16,051][00139] DAMAGECOUNT value on done: 9375.0 [2024-08-05 06:14:16,051][00139] Sum rewards: -5.161, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.270', 'AMMO2': '0.011', 'AMMO4': '0.053', 'HITCOUNT': '0.070', 'WEAPON4': '0.100', 'weapon4': '0.118', 'AMMO3': '0.183', 'DAMAGECOUNT': '0.210', 'WEAPON3': '0.850', 'weapon2': '1.076', 'weapon3': '1.688', 'FRAGCOUNT': '2.000'} [2024-08-05 06:14:16,295][00139] DAMAGECOUNT value on done: 9400.0 [2024-08-05 06:14:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2097152. Throughput: 0: 284.7. Samples: 525834. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:20,485][00034] Avg episode reward: [(0, '-6.250')] [2024-08-05 06:14:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2105344. Throughput: 0: 284.7. Samples: 527564. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:25,485][00034] Avg episode reward: [(0, '-6.250')] [2024-08-05 06:14:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2113536. Throughput: 0: 285.3. Samples: 528417. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:30,484][00034] Avg episode reward: [(0, '-6.250')] [2024-08-05 06:14:30,600][00139] DAMAGECOUNT value on done: 9530.0 [2024-08-05 06:14:30,601][00139] Sum rewards: -3.583, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.040', 'AMMO2': '0.006', 'weapon4': '0.024', 'AMMO5': '0.025', 'AMMO4': '0.028', 'WEAPON1': '0.030', 'WEAPON4': '0.050', 'weapon5': '0.056', 'AMMO3': '0.097', 'HITCOUNT': '0.100', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.465', 'WEAPON3': '0.500', 'weapon2': '0.928', 'weapon3': '1.698'} [2024-08-05 06:14:30,843][00139] DAMAGECOUNT value on done: 9640.0 [2024-08-05 06:14:30,843][00139] Sum rewards: -4.578, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.740', 'AMMO2': '0.009', 'WEAPON1': '0.010', 'ARMOR': '0.016', 'weapon4': '0.022', 'AMMO5': '0.024', 'AMMO4': '0.047', 'AMMO3': '0.095', 'HITCOUNT': '0.120', 'WEAPON4': '0.150', 'weapon5': '0.170', 'WEAPON5': '0.450', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.720', 'weapon3': '1.200', 'weapon2': '1.378'} [2024-08-05 06:14:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2121728. Throughput: 0: 286.0. Samples: 530137. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:35,484][00034] Avg episode reward: [(0, '-6.211')] [2024-08-05 06:14:35,491][00132] Saving new best policy, reward=-6.211! [2024-08-05 06:14:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2121728. Throughput: 0: 286.6. Samples: 531888. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:40,485][00034] Avg episode reward: [(0, '-6.211')] [2024-08-05 06:14:42,309][00138] Updated weights for policy 0, policy_version 260 (0.0017) [2024-08-05 06:14:45,218][00139] DAMAGECOUNT value on done: 9924.0 [2024-08-05 06:14:45,218][00139] Sum rewards: 0.890, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.016', 'AMMO5': '0.008', 'WEAPON1': '0.020', 'AMMO2': '0.031', 'weapon4': '0.046', 'weapon5': '0.052', 'AMMO3': '0.114', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'AMMO4': '0.153', 'HITCOUNT': '0.160', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.182', 'weapon3': '1.230', 'weapon2': '1.310', 'FRAGCOUNT': '5.000'} [2024-08-05 06:14:45,444][00139] DAMAGECOUNT value on done: 9645.0 [2024-08-05 06:14:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2129920. Throughput: 0: 285.3. Samples: 532731. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:45,485][00034] Avg episode reward: [(0, '-6.120')] [2024-08-05 06:14:45,492][00132] Saving new best policy, reward=-6.120! [2024-08-05 06:14:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2138112. Throughput: 0: 284.8. Samples: 534399. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:50,484][00034] Avg episode reward: [(0, '-6.120')] [2024-08-05 06:14:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2138112. Throughput: 0: 285.7. Samples: 536149. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:14:55,484][00034] Avg episode reward: [(0, '-6.120')] [2024-08-05 06:14:59,964][00139] DAMAGECOUNT value on done: 10091.0 [2024-08-05 06:14:59,965][00139] Sum rewards: -4.136, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.370', 'FRAGCOUNT': '-0.500', 'weapon4': '0.006', 'AMMO2': '0.014', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'ARMOR': '0.026', 'weapon5': '0.034', 'AMMO4': '0.070', 'AMMO3': '0.084', 'HITCOUNT': '0.130', 'WEAPON4': '0.200', 'WEAPON5': '0.300', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.501', 'weapon2': '1.232', 'weapon3': '1.350'} [2024-08-05 06:15:00,200][00139] DAMAGECOUNT value on done: 9695.0 [2024-08-05 06:15:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2146304. Throughput: 0: 285.1. Samples: 537004. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:00,484][00034] Avg episode reward: [(0, '-6.074')] [2024-08-05 06:15:00,488][00132] Saving new best policy, reward=-6.074! [2024-08-05 06:15:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2154496. Throughput: 0: 286.3. Samples: 538719. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:05,484][00034] Avg episode reward: [(0, '-6.074')] [2024-08-05 06:15:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2154496. Throughput: 0: 286.9. Samples: 540474. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:10,485][00034] Avg episode reward: [(0, '-6.074')] [2024-08-05 06:15:14,649][00139] DAMAGECOUNT value on done: 10096.0 [2024-08-05 06:15:14,879][00139] DAMAGECOUNT value on done: 9785.0 [2024-08-05 06:15:14,880][00139] Sum rewards: -7.800, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.805', 'AMMO5': '0.007', 'weapon5': '0.010', 'ARMOR': '0.020', 'AMMO2': '0.037', 'HITCOUNT': '0.100', 'weapon4': '0.102', 'WEAPON5': '0.150', 'AMMO4': '0.183', 'AMMO3': '0.195', 'DAMAGECOUNT': '0.270', 'WEAPON4': '0.350', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon3': '1.340', 'weapon2': '1.440'} [2024-08-05 06:15:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2162688. Throughput: 0: 286.2. Samples: 541298. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:15,484][00034] Avg episode reward: [(0, '-6.014')] [2024-08-05 06:15:15,490][00132] Saving new best policy, reward=-6.014! [2024-08-05 06:15:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2170880. Throughput: 0: 285.3. Samples: 542976. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:20,484][00034] Avg episode reward: [(0, '-6.014')] [2024-08-05 06:15:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2179072. Throughput: 0: 286.0. Samples: 544758. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:25,484][00034] Avg episode reward: [(0, '-6.014')] [2024-08-05 06:15:29,201][00139] DAMAGECOUNT value on done: 10170.0 [2024-08-05 06:15:29,201][00139] Sum rewards: -10.880, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-4.500', 'HEALTH': '-0.889', 'AMMO5': '0.014', 'WEAPON1': '0.020', 'AMMO2': '0.032', 'AMMO3': '0.058', 'HITCOUNT': '0.080', 'ARMOR': '0.112', 'weapon5': '0.134', 'AMMO4': '0.160', 'DAMAGECOUNT': '0.222', 'weapon4': '0.274', 'WEAPON3': '0.300', 'WEAPON5': '0.300', 'WEAPON4': '0.350', 'weapon3': '0.452', 'weapon2': '1.750'} [2024-08-05 06:15:29,429][00139] DAMAGECOUNT value on done: 9920.0 [2024-08-05 06:15:29,430][00139] Sum rewards: -7.291, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.912', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.009', 'WEAPON1': '0.020', 'AMMO2': '0.026', 'weapon5': '0.052', 'weapon4': '0.060', 'HITCOUNT': '0.100', 'AMMO3': '0.103', 'AMMO4': '0.128', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.405', 'WEAPON3': '0.650', 'weapon3': '1.134', 'weapon2': '1.584'} [2024-08-05 06:15:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 2179072. Throughput: 0: 286.6. Samples: 545630. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:30,485][00034] Avg episode reward: [(0, '-6.066')] [2024-08-05 06:15:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2187264. Throughput: 0: 288.2. Samples: 547367. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:35,485][00034] Avg episode reward: [(0, '-6.066')] [2024-08-05 06:15:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000267_2187264.pth... 
[2024-08-05 06:15:35,581][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000233_1908736.pth [2024-08-05 06:15:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2195456. Throughput: 0: 286.3. Samples: 549032. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:40,484][00034] Avg episode reward: [(0, '-6.066')] [2024-08-05 06:15:43,855][00139] DAMAGECOUNT value on done: 10335.0 [2024-08-05 06:15:43,856][00139] Sum rewards: -3.958, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.010', 'weapon5': '0.024', 'HEALTH': '0.030', 'AMMO2': '0.036', 'ARMOR': '0.040', 'weapon7': '0.052', 'AMMO3': '0.094', 'HITCOUNT': '0.130', 'weapon4': '0.144', 'WEAPON5': '0.150', 'AMMO4': '0.179', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON4': '0.400', 'WEAPON7': '0.400', 'WEAPON3': '0.400', 'DAMAGECOUNT': '0.495', 'weapon3': '1.122', 'weapon2': '1.446'} [2024-08-05 06:15:44,090][00139] DAMAGECOUNT value on done: 10178.0 [2024-08-05 06:15:44,090][00139] Sum rewards: -5.725, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.831', 'WEAPON1': '0.010', 'AMMO5': '0.013', 'AMMO2': '0.033', 'weapon4': '0.064', 'ARMOR': '0.092', 'AMMO3': '0.159', 'AMMO4': '0.164', 'HITCOUNT': '0.180', 'WEAPON5': '0.250', 'WEAPON4': '0.350', 'DAMAGECOUNT': '0.774', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.164', 'weapon3': '1.454'} [2024-08-05 06:15:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2195456. Throughput: 0: 287.1. Samples: 549922. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:45,484][00034] Avg episode reward: [(0, '-6.044')] [2024-08-05 06:15:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2203648. Throughput: 0: 286.8. Samples: 551627. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:50,485][00034] Avg episode reward: [(0, '-6.044')] [2024-08-05 06:15:53,863][00138] Updated weights for policy 0, policy_version 270 (0.0017) [2024-08-05 06:15:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2211840. Throughput: 0: 286.2. Samples: 553351. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:15:55,484][00034] Avg episode reward: [(0, '-6.044')] [2024-08-05 06:15:58,544][00139] DAMAGECOUNT value on done: 10547.0 [2024-08-05 06:15:58,545][00139] Sum rewards: -10.622, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.954', 'AMMO2': '0.051', 'ARMOR': '0.064', 'AMMO3': '0.139', 'HITCOUNT': '0.180', 'AMMO4': '0.256', 'weapon4': '0.378', 'WEAPON4': '0.600', 'DAMAGECOUNT': '0.636', 'WEAPON3': '0.700', 'weapon3': '0.806', 'FRAGCOUNT': '1.000', 'weapon2': '1.772'} [2024-08-05 06:15:58,777][00139] DAMAGECOUNT value on done: 10278.0 [2024-08-05 06:15:58,778][00139] Sum rewards: -7.117, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.020', 'AMMO5': '0.005', 'weapon5': '0.010', 'AMMO2': '0.027', 'HITCOUNT': '0.070', 'ARMOR': '0.073', 'WEAPON5': '0.100', 'AMMO4': '0.135', 'AMMO3': '0.194', 'weapon4': '0.194', 'DAMAGECOUNT': '0.300', 'WEAPON4': '0.400', 'WEAPON3': '0.800', 'weapon3': '1.132', 'weapon2': '1.462', 'FRAGCOUNT': '2.000'} [2024-08-05 06:16:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2211840. Throughput: 0: 287.2. Samples: 554220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:00,484][00034] Avg episode reward: [(0, '-6.111')] [2024-08-05 06:16:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2220032. Throughput: 0: 287.4. Samples: 555908. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:05,485][00034] Avg episode reward: [(0, '-6.111')] [2024-08-05 06:16:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2228224. Throughput: 0: 286.0. Samples: 557629. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:10,485][00034] Avg episode reward: [(0, '-6.111')] [2024-08-05 06:16:13,263][00139] DAMAGECOUNT value on done: 10693.0 [2024-08-05 06:16:13,263][00139] Sum rewards: -2.993, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.670', 'weapon5': '0.010', 'AMMO5': '0.013', 'AMMO2': '0.018', 'WEAPON1': '0.020', 'ARMOR': '0.048', 'weapon4': '0.070', 'AMMO4': '0.089', 'WEAPON4': '0.100', 'AMMO3': '0.110', 'HITCOUNT': '0.120', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.438', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.158', 'weapon3': '1.434'} [2024-08-05 06:16:13,493][00139] DAMAGECOUNT value on done: 10433.0 [2024-08-05 06:16:13,494][00139] Sum rewards: -5.311, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.208', 'AMMO5': '0.005', 'weapon4': '0.014', 'AMMO2': '0.017', 'ARMOR': '0.064', 'AMMO4': '0.084', 'WEAPON5': '0.100', 'HITCOUNT': '0.150', 'AMMO3': '0.174', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.465', 'WEAPON3': '0.850', 'weapon2': '1.348', 'weapon3': '1.426', 'FRAGCOUNT': '3.000'} [2024-08-05 06:16:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2236416. Throughput: 0: 285.9. Samples: 558497. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:15,484][00034] Avg episode reward: [(0, '-6.064')] [2024-08-05 06:16:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2236416. Throughput: 0: 285.7. Samples: 560222. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:20,485][00034] Avg episode reward: [(0, '-6.064')] [2024-08-05 06:16:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 2244608. Throughput: 0: 286.0. Samples: 561903. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:25,485][00034] Avg episode reward: [(0, '-6.064')] [2024-08-05 06:16:27,928][00139] DAMAGECOUNT value on done: 10870.0 [2024-08-05 06:16:28,142][00139] DAMAGECOUNT value on done: 10688.0 [2024-08-05 06:16:28,143][00139] Sum rewards: -4.668, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.820', 'AMMO2': '0.011', 'AMMO5': '0.025', 'weapon4': '0.032', 'weapon5': '0.034', 'WEAPON1': '0.040', 'WEAPON4': '0.050', 'AMMO4': '0.054', 'HITCOUNT': '0.180', 'AMMO3': '0.201', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.765', 'WEAPON3': '1.050', 'weapon2': '1.168', 'weapon3': '1.442', 'FRAGCOUNT': '3.000'} [2024-08-05 06:16:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2252800. Throughput: 0: 286.1. Samples: 562796. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:30,484][00034] Avg episode reward: [(0, '-5.908')] [2024-08-05 06:16:30,487][00132] Saving new best policy, reward=-5.908! [2024-08-05 06:16:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2252800. Throughput: 0: 286.7. Samples: 564530. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:35,484][00034] Avg episode reward: [(0, '-5.908')] [2024-08-05 06:16:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2260992. Throughput: 0: 286.6. Samples: 566247. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:40,484][00034] Avg episode reward: [(0, '-5.908')] [2024-08-05 06:16:42,441][00139] DAMAGECOUNT value on done: 10997.0 [2024-08-05 06:16:42,441][00139] Sum rewards: -2.097, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.228', 'AMMO2': '0.001', 'AMMO4': '0.006', 'weapon7': '0.024', 'WEAPON4': '0.050', 'HITCOUNT': '0.080', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.121', 'weapon4': '0.148', 'DAMAGECOUNT': '0.381', 'ARMOR': '0.580', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.140', 'weapon2': '1.400'} [2024-08-05 06:16:42,656][00139] DAMAGECOUNT value on done: 10843.0 [2024-08-05 06:16:42,657][00139] Sum rewards: -11.119, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-4.500', 'HEALTH': '-1.970', 'AMMO2': '0.018', 'AMMO5': '0.019', 'weapon4': '0.024', 'WEAPON1': '0.030', 'weapon5': '0.048', 'AMMO3': '0.088', 'AMMO4': '0.089', 'HITCOUNT': '0.140', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.465', 'WEAPON3': '0.550', 'weapon2': '1.010', 'weapon3': '1.420'} [2024-08-05 06:16:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2269184. Throughput: 0: 287.1. Samples: 567138. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:45,485][00034] Avg episode reward: [(0, '-5.914')] [2024-08-05 06:16:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2269184. Throughput: 0: 288.0. Samples: 568867. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:50,484][00034] Avg episode reward: [(0, '-5.914')] [2024-08-05 06:16:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2277376. Throughput: 0: 286.3. Samples: 570511. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:16:55,484][00034] Avg episode reward: [(0, '-5.914')] [2024-08-05 06:16:57,286][00139] DAMAGECOUNT value on done: 11047.0 [2024-08-05 06:16:57,526][00139] DAMAGECOUNT value on done: 10848.0 [2024-08-05 06:17:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2285568. Throughput: 0: 286.2. Samples: 571374. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:00,485][00034] Avg episode reward: [(0, '-5.869')] [2024-08-05 06:17:00,486][00132] Saving new best policy, reward=-5.869! [2024-08-05 06:17:05,380][00138] Updated weights for policy 0, policy_version 280 (0.0018) [2024-08-05 06:17:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2293760. Throughput: 0: 286.2. Samples: 573102. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:05,484][00034] Avg episode reward: [(0, '-5.869')] [2024-08-05 06:17:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2293760. Throughput: 0: 287.7. Samples: 574851. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:10,484][00034] Avg episode reward: [(0, '-5.869')] [2024-08-05 06:17:11,770][00139] DAMAGECOUNT value on done: 11167.0 [2024-08-05 06:17:11,771][00139] Sum rewards: -3.958, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.638', 'AMMO5': '0.013', 'AMMO2': '0.015', 'weapon4': '0.030', 'weapon5': '0.040', 'WEAPON1': '0.050', 'AMMO4': '0.074', 'WEAPON5': '0.150', 'AMMO3': '0.158', 'HITCOUNT': '0.160', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.360', 'ARMOR': '0.500', 'WEAPON3': '0.550', 'weapon2': '0.984', 'FRAGCOUNT': '1.000', 'weapon3': '1.646'} [2024-08-05 06:17:11,979][00139] DAMAGECOUNT value on done: 10938.0 [2024-08-05 06:17:11,979][00139] Sum rewards: -6.533, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.394', 'weapon5': '0.002', 'WEAPON1': '0.010', 'AMMO5': '0.010', 'AMMO2': '0.024', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'weapon4': '0.104', 'AMMO4': '0.118', 'WEAPON4': '0.150', 'AMMO3': '0.155', 'DAMAGECOUNT': '0.270', 'weapon2': '0.836', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon3': '1.632'} [2024-08-05 06:17:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2301952. Throughput: 0: 287.0. Samples: 575711. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:15,485][00034] Avg episode reward: [(0, '-5.837')] [2024-08-05 06:17:15,491][00132] Saving new best policy, reward=-5.837! [2024-08-05 06:17:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2310144. Throughput: 0: 287.2. Samples: 577452. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:20,484][00034] Avg episode reward: [(0, '-5.837')] [2024-08-05 06:17:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2310144. Throughput: 0: 286.8. Samples: 579154. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:25,485][00034] Avg episode reward: [(0, '-5.837')] [2024-08-05 06:17:26,451][00139] DAMAGECOUNT value on done: 11391.0 [2024-08-05 06:17:26,451][00139] Sum rewards: -5.930, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.502', 'AMMO5': '0.016', 'AMMO2': '0.025', 'WEAPON1': '0.050', 'weapon5': '0.088', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'weapon4': '0.108', 'AMMO4': '0.125', 'AMMO3': '0.141', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.672', 'WEAPON3': '0.850', 'weapon2': '1.216', 'weapon3': '1.240', 'FRAGCOUNT': '2.000'} [2024-08-05 06:17:26,738][00139] DAMAGECOUNT value on done: 10985.0 [2024-08-05 06:17:26,738][00139] Sum rewards: -5.180, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.956', 'AMMO4': '-0.015', 'AMMO2': '-0.003', 'AMMO5': '0.013', 'WEAPON1': '0.030', 'HITCOUNT': '0.050', 'weapon7': '0.060', 'DAMAGECOUNT': '0.141', 'WEAPON5': '0.150', 'AMMO6': '0.160', 'AMMO7': '0.160', 'AMMO3': '0.196', 'WEAPON7': '0.200', 'WEAPON3': '1.000', 'FRAGCOUNT': '1.000', 'weapon2': '1.142', 'weapon3': '1.492'} [2024-08-05 06:17:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2318336. Throughput: 0: 286.1. Samples: 580013. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:30,484][00034] Avg episode reward: [(0, '-5.801')] [2024-08-05 06:17:30,487][00132] Saving new best policy, reward=-5.801! [2024-08-05 06:17:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2326528. Throughput: 0: 285.7. Samples: 581723. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:35,484][00034] Avg episode reward: [(0, '-5.801')] [2024-08-05 06:17:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000284_2326528.pth... 
[2024-08-05 06:17:35,564][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000250_2048000.pth [2024-08-05 06:17:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2326528. Throughput: 0: 289.0. Samples: 583517. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:40,485][00034] Avg episode reward: [(0, '-5.801')] [2024-08-05 06:17:40,912][00139] DAMAGECOUNT value on done: 11431.0 [2024-08-05 06:17:41,141][00139] DAMAGECOUNT value on done: 11135.0 [2024-08-05 06:17:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2334720. Throughput: 0: 289.7. Samples: 584409. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:45,485][00034] Avg episode reward: [(0, '-5.748')] [2024-08-05 06:17:45,492][00132] Saving new best policy, reward=-5.748! [2024-08-05 06:17:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2342912. Throughput: 0: 290.2. Samples: 586162. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:50,484][00034] Avg episode reward: [(0, '-5.748')] [2024-08-05 06:17:55,165][00139] DAMAGECOUNT value on done: 11491.0 [2024-08-05 06:17:55,437][00139] DAMAGECOUNT value on done: 11515.0 [2024-08-05 06:17:55,437][00139] Sum rewards: -1.430, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.517', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.011', 'weapon4': '0.034', 'ARMOR': '0.044', 'AMMO4': '0.052', 'weapon5': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.140', 'HITCOUNT': '0.190', 'WEAPON5': '0.200', 'WEAPON3': '0.650', 'weapon2': '1.062', 'DAMAGECOUNT': '1.140', 'weapon3': '1.354', 'FRAGCOUNT': '3.000'} [2024-08-05 06:17:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2351104. Throughput: 0: 290.6. Samples: 587926. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:17:55,485][00034] Avg episode reward: [(0, '-5.689')] [2024-08-05 06:17:55,493][00132] Saving new best policy, reward=-5.689! [2024-08-05 06:18:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2351104. Throughput: 0: 290.4. Samples: 588779. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:00,485][00034] Avg episode reward: [(0, '-5.689')] [2024-08-05 06:18:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2359296. Throughput: 0: 289.7. Samples: 590489. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:05,485][00034] Avg episode reward: [(0, '-5.689')] [2024-08-05 06:18:09,891][00139] DAMAGECOUNT value on done: 11868.0 [2024-08-05 06:18:09,891][00139] Sum rewards: -0.804, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.070', 'AMMO5': '0.011', 'WEAPON1': '0.020', 'AMMO2': '0.030', 'weapon7': '0.036', 'weapon5': '0.042', 'AMMO3': '0.100', 'weapon4': '0.130', 'AMMO4': '0.149', 'WEAPON4': '0.150', 'HITCOUNT': '0.160', 'WEAPON5': '0.250', 'AMMO6': '0.260', 'AMMO7': '0.260', 'WEAPON7': '0.300', 'WEAPON3': '0.550', 'weapon3': '1.002', 'DAMAGECOUNT': '1.131', 'weapon2': '1.434', 'FRAGCOUNT': '1.500'} [2024-08-05 06:18:10,120][00139] DAMAGECOUNT value on done: 11720.0 [2024-08-05 06:18:10,121][00139] Sum rewards: -2.763, reward structure: {'DEATHCOUNT': '-6.000', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.029', 'ARMOR': '0.048', 'AMMO3': '0.056', 'HITCOUNT': '0.070', 'weapon5': '0.122', 'AMMO4': '0.146', 'WEAPON5': '0.150', 'weapon4': '0.152', 'WEAPON4': '0.200', 'WEAPON3': '0.250', 'HEALTH': '0.251', 'DAMAGECOUNT': '0.615', 'weapon3': '0.718', 'weapon2': '1.912'} [2024-08-05 06:18:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2367488. Throughput: 0: 290.0. Samples: 592202. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:10,484][00034] Avg episode reward: [(0, '-5.622')] [2024-08-05 06:18:10,486][00132] Saving new best policy, reward=-5.622! [2024-08-05 06:18:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2367488. Throughput: 0: 290.0. Samples: 593061. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:15,485][00034] Avg episode reward: [(0, '-5.622')] [2024-08-05 06:18:16,162][00138] Updated weights for policy 0, policy_version 290 (0.0017) [2024-08-05 06:18:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2375680. Throughput: 0: 290.1. Samples: 594776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:20,484][00034] Avg episode reward: [(0, '-5.622')] [2024-08-05 06:18:24,502][00139] DAMAGECOUNT value on done: 12247.0 [2024-08-05 06:18:24,503][00139] Sum rewards: -9.639, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-2.636', 'AMMO2': '0.004', 'AMMO4': '0.020', 'WEAPON1': '0.020', 'AMMO5': '0.021', 'weapon5': '0.064', 'AMMO3': '0.187', 'HITCOUNT': '0.290', 'WEAPON5': '0.400', 'ARMOR': '0.400', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.137', 'weapon2': '1.252', 'weapon3': '1.302'} [2024-08-05 06:18:24,736][00139] DAMAGECOUNT value on done: 11752.0 [2024-08-05 06:18:24,736][00139] Sum rewards: -3.956, reward structure: {'DEATHCOUNT': '-9.750', 'AMMO5': '0.007', 'weapon5': '0.018', 'AMMO2': '0.026', 'ARMOR': '0.040', 'HITCOUNT': '0.040', 'weapon7': '0.042', 'DAMAGECOUNT': '0.096', 'AMMO3': '0.097', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO4': '0.131', 'WEAPON5': '0.150', 'weapon4': '0.214', 'WEAPON4': '0.250', 'HEALTH': '0.318', 'WEAPON3': '0.500', 'weapon3': '0.802', 'FRAGCOUNT': '1.000', 'weapon2': '1.762'} [2024-08-05 06:18:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2383872. Throughput: 0: 288.7. Samples: 596507. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:25,485][00034] Avg episode reward: [(0, '-5.650')] [2024-08-05 06:18:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2383872. Throughput: 0: 287.3. Samples: 597336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:30,484][00034] Avg episode reward: [(0, '-5.650')] [2024-08-05 06:18:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2392064. Throughput: 0: 288.4. Samples: 599142. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:35,484][00034] Avg episode reward: [(0, '-5.650')] [2024-08-05 06:18:38,884][00139] DAMAGECOUNT value on done: 12282.0 [2024-08-05 06:18:39,099][00139] DAMAGECOUNT value on done: 11966.0 [2024-08-05 06:18:39,100][00139] Sum rewards: -6.874, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.832', 'AMMO5': '0.005', 'AMMO2': '0.019', 'weapon4': '0.022', 'ARMOR': '0.024', 'AMMO4': '0.093', 'WEAPON5': '0.100', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'AMMO3': '0.201', 'DAMAGECOUNT': '0.642', 'WEAPON3': '1.050', 'weapon2': '1.406', 'weapon3': '1.516', 'FRAGCOUNT': '3.000'} [2024-08-05 06:18:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2400256. Throughput: 0: 288.3. Samples: 600900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:40,485][00034] Avg episode reward: [(0, '-5.680')] [2024-08-05 06:18:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2408448. Throughput: 0: 288.9. Samples: 601780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:45,484][00034] Avg episode reward: [(0, '-5.680')] [2024-08-05 06:18:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2408448. Throughput: 0: 290.2. Samples: 603547. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:50,484][00034] Avg episode reward: [(0, '-5.680')] [2024-08-05 06:18:53,258][00139] DAMAGECOUNT value on done: 12342.0 [2024-08-05 06:18:53,488][00139] DAMAGECOUNT value on done: 11986.0 [2024-08-05 06:18:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2416640. Throughput: 0: 290.4. Samples: 605268. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:18:55,484][00034] Avg episode reward: [(0, '-5.657')] [2024-08-05 06:19:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2424832. Throughput: 0: 290.7. Samples: 606143. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:00,484][00034] Avg episode reward: [(0, '-5.657')] [2024-08-05 06:19:04,504][00139] Large shaping reward -2.549 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.3, -100.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 06:19:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2424832. Throughput: 0: 289.9. Samples: 607820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:05,485][00034] Avg episode reward: [(0, '-5.657')] [2024-08-05 06:19:07,967][00139] DAMAGECOUNT value on done: 12463.0 [2024-08-05 06:19:07,967][00139] Sum rewards: -8.633, reward structure: {'DEATHCOUNT': '-10.500', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.370', 'AMMO5': '0.011', 'weapon4': '0.012', 'AMMO2': '0.014', 'weapon5': '0.036', 'ARMOR': '0.052', 'HITCOUNT': '0.070', 'AMMO4': '0.071', 'WEAPON4': '0.150', 'AMMO3': '0.159', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.363', 'weapon3': '0.780', 'WEAPON3': '0.800', 'weapon2': '1.968'} [2024-08-05 06:19:08,173][00139] DAMAGECOUNT value on done: 12178.0 [2024-08-05 06:19:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2433024. Throughput: 0: 290.6. Samples: 609585. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:10,484][00034] Avg episode reward: [(0, '-5.675')] [2024-08-05 06:19:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2441216. Throughput: 0: 291.9. Samples: 610472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:15,485][00034] Avg episode reward: [(0, '-5.675')] [2024-08-05 06:19:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2449408. Throughput: 0: 290.1. Samples: 612197. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:20,484][00034] Avg episode reward: [(0, '-5.675')] [2024-08-05 06:19:22,301][00139] DAMAGECOUNT value on done: 12629.0 [2024-08-05 06:19:22,302][00139] Sum rewards: -2.991, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.685', 'ARMOR': '0.008', 'AMMO2': '0.016', 'WEAPON1': '0.020', 'weapon5': '0.022', 'AMMO5': '0.022', 'weapon4': '0.076', 'AMMO4': '0.081', 'AMMO3': '0.120', 'HITCOUNT': '0.120', 'WEAPON4': '0.150', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.498', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.018', 'weapon3': '1.842'} [2024-08-05 06:19:22,523][00139] DAMAGECOUNT value on done: 12238.0 [2024-08-05 06:19:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2449408. Throughput: 0: 291.4. Samples: 614014. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:25,484][00034] Avg episode reward: [(0, '-5.611')] [2024-08-05 06:19:25,492][00132] Saving new best policy, reward=-5.611! [2024-08-05 06:19:26,562][00138] Updated weights for policy 0, policy_version 300 (0.0017) [2024-08-05 06:19:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2457600. Throughput: 0: 290.9. Samples: 614871. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:30,486][00034] Avg episode reward: [(0, '-5.611')] [2024-08-05 06:19:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2465792. Throughput: 0: 288.4. Samples: 616523. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:35,484][00034] Avg episode reward: [(0, '-5.611')] [2024-08-05 06:19:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000301_2465792.pth... [2024-08-05 06:19:35,579][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000267_2187264.pth [2024-08-05 06:19:36,955][00139] DAMAGECOUNT value on done: 12789.0 [2024-08-05 06:19:36,955][00139] Sum rewards: -2.176, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.320', 'AMMO5': '0.003', 'AMMO2': '0.011', 'HITCOUNT': '0.040', 'AMMO4': '0.053', 'weapon5': '0.080', 'WEAPON5': '0.100', 'AMMO3': '0.113', 'weapon4': '0.158', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.480', 'ARMOR': '0.540', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon3': '1.252', 'weapon2': '1.264'} [2024-08-05 06:19:37,162][00139] DAMAGECOUNT value on done: 12368.0 [2024-08-05 06:19:37,163][00139] Sum rewards: -3.992, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.470', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.013', 'ARMOR': '0.018', 'weapon4': '0.040', 'weapon5': '0.054', 'AMMO4': '0.064', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'AMMO3': '0.165', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.900', 'weapon3': '1.408', 'weapon2': '1.426', 'FRAGCOUNT': '3.000'} [2024-08-05 06:19:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2465792. Throughput: 0: 288.6. Samples: 618257. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:40,484][00034] Avg episode reward: [(0, '-5.523')] [2024-08-05 06:19:40,486][00132] Saving new best policy, reward=-5.523! 
[2024-08-05 06:19:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2473984. Throughput: 0: 287.4. Samples: 619077. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:45,484][00034] Avg episode reward: [(0, '-5.523')] [2024-08-05 06:19:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2482176. Throughput: 0: 288.9. Samples: 620822. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:50,485][00034] Avg episode reward: [(0, '-5.523')] [2024-08-05 06:19:51,476][00139] DAMAGECOUNT value on done: 13082.0 [2024-08-05 06:19:51,476][00139] Sum rewards: -6.847, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.480', 'weapon5': '0.004', 'AMMO2': '0.010', 'AMMO5': '0.010', 'AMMO4': '0.048', 'weapon4': '0.066', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.232', 'HITCOUNT': '0.240', 'DAMAGECOUNT': '0.879', 'WEAPON3': '1.100', 'weapon2': '1.222', 'weapon3': '1.572', 'FRAGCOUNT': '2.000'} [2024-08-05 06:19:51,689][00139] DAMAGECOUNT value on done: 12435.0 [2024-08-05 06:19:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2490368. Throughput: 0: 289.3. Samples: 622602. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:19:55,484][00034] Avg episode reward: [(0, '-5.512')] [2024-08-05 06:19:55,493][00132] Saving new best policy, reward=-5.512! [2024-08-05 06:20:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2490368. Throughput: 0: 289.1. Samples: 623481. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:00,485][00034] Avg episode reward: [(0, '-5.512')] [2024-08-05 06:20:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2498560. Throughput: 0: 288.1. Samples: 625161. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:05,485][00034] Avg episode reward: [(0, '-5.512')] [2024-08-05 06:20:06,153][00139] DAMAGECOUNT value on done: 13167.0 [2024-08-05 06:20:06,153][00139] Sum rewards: -8.449, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.194', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'AMMO2': '0.008', 'AMMO4': '0.040', 'weapon5': '0.078', 'HITCOUNT': '0.100', 'WEAPON4': '0.100', 'weapon4': '0.108', 'WEAPON5': '0.150', 'AMMO3': '0.162', 'DAMAGECOUNT': '0.255', 'WEAPON3': '0.750', 'weapon3': '0.892', 'weapon2': '1.844'} [2024-08-05 06:20:06,389][00139] DAMAGECOUNT value on done: 12573.0 [2024-08-05 06:20:06,390][00139] Sum rewards: -3.341, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.500', 'ARMOR': '0.008', 'AMMO2': '0.017', 'AMMO5': '0.018', 'WEAPON1': '0.020', 'weapon5': '0.064', 'AMMO4': '0.085', 'HITCOUNT': '0.140', 'AMMO3': '0.146', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.414', 'WEAPON3': '0.650', 'weapon2': '1.222', 'weapon3': '1.224', 'FRAGCOUNT': '2.000'} [2024-08-05 06:20:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2506752. Throughput: 0: 285.4. Samples: 626859. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:10,484][00034] Avg episode reward: [(0, '-5.498')] [2024-08-05 06:20:10,487][00132] Saving new best policy, reward=-5.498! [2024-08-05 06:20:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2506752. Throughput: 0: 285.0. Samples: 627694. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:15,484][00034] Avg episode reward: [(0, '-5.498')] [2024-08-05 06:20:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2514944. Throughput: 0: 285.1. Samples: 629351. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:20,484][00034] Avg episode reward: [(0, '-5.498')] [2024-08-05 06:20:21,086][00139] DAMAGECOUNT value on done: 13272.0 [2024-08-05 06:20:21,087][00139] Sum rewards: -6.100, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.782', 'WEAPON1': '0.010', 'AMMO5': '0.018', 'AMMO2': '0.029', 'weapon5': '0.032', 'ARMOR': '0.076', 'HITCOUNT': '0.090', 'AMMO4': '0.143', 'AMMO3': '0.160', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.315', 'weapon4': '0.324', 'WEAPON5': '0.350', 'weapon3': '0.620', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.916'} [2024-08-05 06:20:21,326][00139] DAMAGECOUNT value on done: 12761.0 [2024-08-05 06:20:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2523136. Throughput: 0: 284.3. Samples: 631050. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:25,484][00034] Avg episode reward: [(0, '-5.520')] [2024-08-05 06:20:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2523136. Throughput: 0: 285.5. Samples: 631924. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:30,485][00034] Avg episode reward: [(0, '-5.520')] [2024-08-05 06:20:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2531328. Throughput: 0: 283.4. Samples: 633576. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:35,485][00034] Avg episode reward: [(0, '-5.520')] [2024-08-05 06:20:36,087][00139] DAMAGECOUNT value on done: 13677.0 [2024-08-05 06:20:36,087][00139] Sum rewards: -1.248, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.680', 'WEAPON1': '0.010', 'AMMO5': '0.025', 'AMMO2': '0.027', 'weapon7': '0.076', 'weapon5': '0.086', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.120', 'AMMO4': '0.134', 'HITCOUNT': '0.140', 'WEAPON4': '0.200', 'weapon4': '0.392', 'WEAPON5': '0.400', 'WEAPON3': '0.700', 'weapon3': '0.798', 'DAMAGECOUNT': '1.215', 'weapon2': '1.310', 'FRAGCOUNT': '2.000'} [2024-08-05 06:20:36,303][00139] DAMAGECOUNT value on done: 13264.0 [2024-08-05 06:20:36,303][00139] Sum rewards: -2.599, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.930', 'AMMO2': '0.017', 'weapon7': '0.070', 'ARMOR': '0.080', 'AMMO4': '0.083', 'AMMO3': '0.114', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HITCOUNT': '0.140', 'WEAPON7': '0.200', 'WEAPON4': '0.250', 'weapon4': '0.264', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.909', 'FRAGCOUNT': '1.000', 'weapon3': '1.126', 'weapon2': '1.438'} [2024-08-05 06:20:38,657][00138] Updated weights for policy 0, policy_version 310 (0.0017) [2024-08-05 06:20:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2539520. Throughput: 0: 282.4. Samples: 635312. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:40,484][00034] Avg episode reward: [(0, '-5.499')] [2024-08-05 06:20:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2539520. Throughput: 0: 282.9. Samples: 636210. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:45,484][00034] Avg episode reward: [(0, '-5.499')] [2024-08-05 06:20:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2547712. Throughput: 0: 283.5. Samples: 637917. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:50,484][00034] Avg episode reward: [(0, '-5.499')] [2024-08-05 06:20:50,611][00139] DAMAGECOUNT value on done: 13929.0 [2024-08-05 06:20:50,612][00139] Sum rewards: -4.055, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.876', 'WEAPON1': '0.010', 'AMMO2': '0.015', 'AMMO5': '0.015', 'weapon4': '0.062', 'weapon7': '0.068', 'AMMO4': '0.074', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'weapon5': '0.116', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.183', 'WEAPON7': '0.200', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.756', 'WEAPON3': '0.900', 'weapon3': '1.134', 'weapon2': '1.308', 'FRAGCOUNT': '2.000'} [2024-08-05 06:20:50,830][00139] DAMAGECOUNT value on done: 13416.0 [2024-08-05 06:20:50,831][00139] Sum rewards: -5.212, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.058', 'AMMO5': '0.010', 'weapon4': '0.014', 'weapon5': '0.016', 'AMMO2': '0.019', 'WEAPON1': '0.020', 'ARMOR': '0.036', 'HITCOUNT': '0.090', 'AMMO4': '0.093', 'WEAPON4': '0.100', 'AMMO3': '0.192', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.456', 'WEAPON3': '1.050', 'weapon2': '1.180', 'weapon3': '1.620', 'FRAGCOUNT': '2.000'} [2024-08-05 06:20:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2555904. Throughput: 0: 283.8. Samples: 639629. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:20:55,484][00034] Avg episode reward: [(0, '-5.493')] [2024-08-05 06:20:55,491][00132] Saving new best policy, reward=-5.493! [2024-08-05 06:21:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2564096. Throughput: 0: 284.2. Samples: 640485. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:00,484][00034] Avg episode reward: [(0, '-5.493')] [2024-08-05 06:21:05,330][00139] DAMAGECOUNT value on done: 14169.0 [2024-08-05 06:21:05,330][00139] Sum rewards: -5.894, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.955', 'AMMO2': '0.009', 'WEAPON1': '0.020', 'AMMO5': '0.022', 'AMMO4': '0.042', 'WEAPON4': '0.050', 'weapon5': '0.068', 'HITCOUNT': '0.100', 'AMMO3': '0.214', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.720', 'weapon2': '0.868', 'WEAPON3': '1.050', 'weapon3': '1.748', 'FRAGCOUNT': '2.000'} [2024-08-05 06:21:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2564096. Throughput: 0: 285.7. Samples: 642207. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:05,484][00034] Avg episode reward: [(0, '-5.516')] [2024-08-05 06:21:05,614][00139] DAMAGECOUNT value on done: 13746.0 [2024-08-05 06:21:05,614][00139] Sum rewards: -2.558, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.880', 'weapon5': '0.006', 'AMMO5': '0.007', 'AMMO2': '0.009', 'WEAPON1': '0.010', 'AMMO4': '0.042', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'weapon7': '0.054', 'weapon4': '0.064', 'AMMO3': '0.153', 'AMMO6': '0.160', 'AMMO7': '0.160', 'HITCOUNT': '0.190', 'WEAPON7': '0.200', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.990', 'FRAGCOUNT': '1.000', 'weapon2': '1.268', 'weapon3': '1.708'} [2024-08-05 06:21:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2572288. Throughput: 0: 284.6. Samples: 643855. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:10,485][00034] Avg episode reward: [(0, '-5.487')] [2024-08-05 06:21:10,487][00132] Saving new best policy, reward=-5.487! [2024-08-05 06:21:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2580480. Throughput: 0: 284.6. Samples: 644732. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:15,484][00034] Avg episode reward: [(0, '-5.487')] [2024-08-05 06:21:20,129][00139] DAMAGECOUNT value on done: 14284.0 [2024-08-05 06:21:20,129][00139] Sum rewards: -6.468, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.464', 'AMMO5': '0.005', 'weapon5': '0.024', 'AMMO2': '0.036', 'ARMOR': '0.068', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.110', 'weapon4': '0.118', 'AMMO4': '0.180', 'DAMAGECOUNT': '0.345', 'WEAPON4': '0.350', 'WEAPON3': '0.550', 'weapon2': '1.286', 'weapon3': '1.474'} [2024-08-05 06:21:20,361][00139] DAMAGECOUNT value on done: 13912.0 [2024-08-05 06:21:20,362][00139] Sum rewards: -3.008, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.310', 'AMMO2': '0.004', 'AMMO5': '0.015', 'AMMO4': '0.022', 'weapon5': '0.054', 'HITCOUNT': '0.110', 'AMMO3': '0.118', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.498', 'WEAPON3': '0.700', 'weapon2': '1.262', 'weapon3': '1.518', 'FRAGCOUNT': '2.000'} [2024-08-05 06:21:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2580480. Throughput: 0: 285.8. Samples: 646439. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:20,484][00034] Avg episode reward: [(0, '-5.394')] [2024-08-05 06:21:20,486][00132] Saving new best policy, reward=-5.394! [2024-08-05 06:21:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2588672. Throughput: 0: 285.5. Samples: 648159. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:25,484][00034] Avg episode reward: [(0, '-5.394')] [2024-08-05 06:21:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2596864. Throughput: 0: 284.8. Samples: 649026. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:30,485][00034] Avg episode reward: [(0, '-5.394')] [2024-08-05 06:21:34,724][00139] DAMAGECOUNT value on done: 14383.0 [2024-08-05 06:21:34,935][00139] DAMAGECOUNT value on done: 14037.0 [2024-08-05 06:21:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2596864. Throughput: 0: 285.6. Samples: 650767. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:35,484][00034] Avg episode reward: [(0, '-5.294')] [2024-08-05 06:21:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000317_2596864.pth... [2024-08-05 06:21:35,565][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000284_2326528.pth [2024-08-05 06:21:35,574][00132] Saving new best policy, reward=-5.294! [2024-08-05 06:21:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2605056. Throughput: 0: 284.8. Samples: 652443. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:40,484][00034] Avg episode reward: [(0, '-5.294')] [2024-08-05 06:21:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2613248. Throughput: 0: 285.6. Samples: 653336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:45,484][00034] Avg episode reward: [(0, '-5.294')] [2024-08-05 06:21:49,412][00139] DAMAGECOUNT value on done: 14518.0 [2024-08-05 06:21:49,632][00139] DAMAGECOUNT value on done: 14157.0 [2024-08-05 06:21:50,286][00138] Updated weights for policy 0, policy_version 320 (0.0018) [2024-08-05 06:21:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2621440. Throughput: 0: 285.4. Samples: 655050. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:50,484][00034] Avg episode reward: [(0, '-5.321')] [2024-08-05 06:21:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 2621440. Throughput: 0: 286.9. Samples: 656765. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:21:55,484][00034] Avg episode reward: [(0, '-5.321')] [2024-08-05 06:22:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2629632. Throughput: 0: 286.9. Samples: 657643. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:00,485][00034] Avg episode reward: [(0, '-5.321')] [2024-08-05 06:22:03,935][00139] DAMAGECOUNT value on done: 14543.0 [2024-08-05 06:22:04,170][00139] DAMAGECOUNT value on done: 14319.0 [2024-08-05 06:22:04,171][00139] Sum rewards: -3.030, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.334', 'AMMO2': '0.005', 'AMMO5': '0.008', 'weapon7': '0.014', 'WEAPON1': '0.020', 'AMMO4': '0.024', 'weapon5': '0.074', 'HITCOUNT': '0.140', 'WEAPON5': '0.150', 'AMMO3': '0.152', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'ARMOR': '0.424', 'DAMAGECOUNT': '0.486', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.602', 'weapon2': '1.606'} [2024-08-05 06:22:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2637824. Throughput: 0: 287.8. Samples: 659389. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:05,484][00034] Avg episode reward: [(0, '-5.389')] [2024-08-05 06:22:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2637824. Throughput: 0: 287.3. Samples: 661087. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:10,484][00034] Avg episode reward: [(0, '-5.389')] [2024-08-05 06:22:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2646016. Throughput: 0: 287.8. Samples: 661975. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:15,484][00034] Avg episode reward: [(0, '-5.389')] [2024-08-05 06:22:18,369][00139] DAMAGECOUNT value on done: 14725.0 [2024-08-05 06:22:18,369][00139] Sum rewards: 0.469, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.453', 'AMMO5': '0.005', 'weapon5': '0.006', 'AMMO2': '0.022', 'WEAPON5': '0.050', 'AMMO3': '0.104', 'AMMO4': '0.107', 'weapon4': '0.114', 'HITCOUNT': '0.140', 'WEAPON4': '0.250', 'ARMOR': '0.532', 'DAMAGECOUNT': '0.546', 'WEAPON3': '0.550', 'weapon2': '1.322', 'weapon3': '1.674', 'FRAGCOUNT': '3.000'} [2024-08-05 06:22:18,641][00139] DAMAGECOUNT value on done: 14512.0 [2024-08-05 06:22:18,642][00139] Sum rewards: -4.388, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.600', 'ARMOR': '0.004', 'AMMO5': '0.005', 'AMMO2': '0.016', 'WEAPON1': '0.040', 'weapon7': '0.068', 'AMMO3': '0.078', 'AMMO4': '0.081', 'weapon5': '0.084', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'weapon4': '0.140', 'AMMO6': '0.160', 'AMMO7': '0.160', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.579', 'weapon3': '0.944', 'weapon2': '1.892'} [2024-08-05 06:22:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2654208. Throughput: 0: 288.0. Samples: 663728. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:20,484][00034] Avg episode reward: [(0, '-5.345')] [2024-08-05 06:22:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2654208. Throughput: 0: 290.6. Samples: 665520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:25,484][00034] Avg episode reward: [(0, '-5.345')] [2024-08-05 06:22:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2662400. Throughput: 0: 290.5. Samples: 666408. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:30,485][00034] Avg episode reward: [(0, '-5.345')] [2024-08-05 06:22:32,643][00139] DAMAGECOUNT value on done: 15140.0 [2024-08-05 06:22:32,644][00139] Sum rewards: -2.711, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.817', 'AMMO2': '0.025', 'AMMO4': '0.126', 'weapon4': '0.136', 'AMMO3': '0.169', 'WEAPON4': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.950', 'ARMOR': '1.021', 'DAMAGECOUNT': '1.245', 'weapon3': '1.350', 'weapon2': '1.594', 'FRAGCOUNT': '3.000'} [2024-08-05 06:22:32,869][00139] DAMAGECOUNT value on done: 14677.0 [2024-08-05 06:22:32,869][00139] Sum rewards: -3.701, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.356', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'AMMO2': '0.022', 'ARMOR': '0.024', 'AMMO4': '0.107', 'HITCOUNT': '0.110', 'weapon5': '0.116', 'AMMO3': '0.117', 'WEAPON4': '0.150', 'weapon4': '0.188', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.500', 'weapon2': '1.146', 'weapon3': '1.450'} [2024-08-05 06:22:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2670592. Throughput: 0: 291.2. Samples: 668155. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:35,484][00034] Avg episode reward: [(0, '-5.281')] [2024-08-05 06:22:35,492][00132] Saving new best policy, reward=-5.281! [2024-08-05 06:22:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2678784. Throughput: 0: 291.5. Samples: 669881. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:40,487][00034] Avg episode reward: [(0, '-5.281')] [2024-08-05 06:22:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2678784. Throughput: 0: 291.0. Samples: 670736. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:45,486][00034] Avg episode reward: [(0, '-5.281')] [2024-08-05 06:22:47,275][00139] DAMAGECOUNT value on done: 15320.0 [2024-08-05 06:22:47,276][00139] Sum rewards: -1.183, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.556', 'AMMO2': '0.004', 'AMMO5': '0.012', 'AMMO4': '0.021', 'WEAPON1': '0.030', 'WEAPON4': '0.050', 'weapon5': '0.116', 'AMMO3': '0.127', 'HITCOUNT': '0.150', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.540', 'WEAPON3': '0.600', 'weapon3': '1.200', 'weapon2': '1.722', 'FRAGCOUNT': '2.000'} [2024-08-05 06:22:47,501][00139] DAMAGECOUNT value on done: 15144.0 [2024-08-05 06:22:47,502][00139] Sum rewards: 0.080, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO2': '0.007', 'WEAPON1': '0.010', 'AMMO5': '0.017', 'AMMO4': '0.037', 'weapon5': '0.078', 'AMMO3': '0.093', 'HITCOUNT': '0.180', 'HEALTH': '0.184', 'WEAPON5': '0.200', 'WEAPON3': '0.450', 'weapon2': '0.906', 'DAMAGECOUNT': '1.401', 'weapon3': '1.766', 'FRAGCOUNT': '3.000'} [2024-08-05 06:22:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2686976. Throughput: 0: 290.6. Samples: 672466. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:50,485][00034] Avg episode reward: [(0, '-5.154')] [2024-08-05 06:22:50,486][00132] Saving new best policy, reward=-5.154! [2024-08-05 06:22:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2695168. Throughput: 0: 292.2. Samples: 674237. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:22:55,485][00034] Avg episode reward: [(0, '-5.154')] [2024-08-05 06:23:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2695168. Throughput: 0: 291.7. Samples: 675103. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:00,485][00034] Avg episode reward: [(0, '-5.154')] [2024-08-05 06:23:00,684][00138] Updated weights for policy 0, policy_version 330 (0.0019) [2024-08-05 06:23:01,646][00139] DAMAGECOUNT value on done: 15395.0 [2024-08-05 06:23:01,857][00139] DAMAGECOUNT value on done: 15183.0 [2024-08-05 06:23:01,858][00139] Sum rewards: -7.085, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.622', 'AMMO5': '0.006', 'ARMOR': '0.012', 'AMMO2': '0.028', 'HITCOUNT': '0.050', 'DAMAGECOUNT': '0.117', 'weapon5': '0.120', 'AMMO4': '0.142', 'AMMO3': '0.181', 'WEAPON5': '0.200', 'weapon4': '0.232', 'WEAPON4': '0.350', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon3': '1.150', 'weapon2': '1.348'} [2024-08-05 06:23:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2703360. Throughput: 0: 291.8. Samples: 676860. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:05,485][00034] Avg episode reward: [(0, '-5.170')] [2024-08-05 06:23:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2711552. Throughput: 0: 290.8. Samples: 678604. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:10,485][00034] Avg episode reward: [(0, '-5.170')] [2024-08-05 06:23:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2719744. Throughput: 0: 289.2. Samples: 679424. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:15,484][00034] Avg episode reward: [(0, '-5.170')] [2024-08-05 06:23:16,347][00139] DAMAGECOUNT value on done: 15515.0 [2024-08-05 06:23:16,575][00139] DAMAGECOUNT value on done: 15188.0 [2024-08-05 06:23:16,576][00139] Sum rewards: -6.779, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.451', 'ARMOR': '0.008', 'HITCOUNT': '0.010', 'AMMO5': '0.011', 'DAMAGECOUNT': '0.015', 'AMMO2': '0.040', 'weapon4': '0.056', 'weapon5': '0.066', 'AMMO3': '0.127', 'AMMO4': '0.200', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'WEAPON3': '0.750', 'weapon3': '1.180', 'weapon2': '1.260'} [2024-08-05 06:23:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2719744. Throughput: 0: 289.1. Samples: 681163. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:20,485][00034] Avg episode reward: [(0, '-5.132')] [2024-08-05 06:23:20,486][00132] Saving new best policy, reward=-5.132! [2024-08-05 06:23:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2727936. Throughput: 0: 289.1. Samples: 682890. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:25,485][00034] Avg episode reward: [(0, '-5.132')] [2024-08-05 06:23:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2736128. Throughput: 0: 289.4. Samples: 683757. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:30,484][00034] Avg episode reward: [(0, '-5.132')] [2024-08-05 06:23:30,903][00139] DAMAGECOUNT value on done: 15655.0 [2024-08-05 06:23:30,903][00139] Sum rewards: -9.273, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.818', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.021', 'weapon4': '0.046', 'ARMOR': '0.052', 'weapon5': '0.064', 'HITCOUNT': '0.090', 'AMMO4': '0.104', 'WEAPON5': '0.150', 'WEAPON4': '0.200', 'AMMO3': '0.215', 'DAMAGECOUNT': '0.420', 'WEAPON3': '0.950', 'weapon2': '1.334', 'weapon3': '1.382'} [2024-08-05 06:23:31,115][00139] DAMAGECOUNT value on done: 15388.0 [2024-08-05 06:23:31,115][00139] Sum rewards: -1.705, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.430', 'AMMO2': '0.025', 'WEAPON1': '0.040', 'ARMOR': '0.055', 'weapon4': '0.072', 'AMMO3': '0.103', 'AMMO4': '0.124', 'HITCOUNT': '0.140', 'WEAPON4': '0.300', 'DAMAGECOUNT': '0.600', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.210', 'weapon3': '1.406'} [2024-08-05 06:23:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2736128. Throughput: 0: 289.7. Samples: 685501. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:35,484][00034] Avg episode reward: [(0, '-5.143')] [2024-08-05 06:23:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000334_2736128.pth... [2024-08-05 06:23:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000301_2465792.pth [2024-08-05 06:23:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2744320. Throughput: 0: 288.3. Samples: 687210. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:40,484][00034] Avg episode reward: [(0, '-5.143')] [2024-08-05 06:23:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2752512. 
Throughput: 0: 288.0. Samples: 688062. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:45,485][00034] Avg episode reward: [(0, '-5.143')] [2024-08-05 06:23:45,542][00139] DAMAGECOUNT value on done: 15804.0 [2024-08-05 06:23:45,543][00139] Sum rewards: -4.064, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.486', 'weapon4': '0.006', 'AMMO5': '0.008', 'ARMOR': '0.012', 'AMMO2': '0.020', 'AMMO4': '0.099', 'WEAPON5': '0.100', 'weapon5': '0.104', 'HITCOUNT': '0.110', 'AMMO3': '0.126', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.447', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.316', 'weapon3': '1.474'} [2024-08-05 06:23:45,772][00139] DAMAGECOUNT value on done: 15393.0 [2024-08-05 06:23:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2752512. Throughput: 0: 287.0. Samples: 689775. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:50,484][00034] Avg episode reward: [(0, '-5.126')] [2024-08-05 06:23:50,507][00132] Saving new best policy, reward=-5.126! [2024-08-05 06:23:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2760704. Throughput: 0: 288.2. Samples: 691572. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:23:55,484][00034] Avg episode reward: [(0, '-5.126')] [2024-08-05 06:23:59,945][00139] DAMAGECOUNT value on done: 15819.0 [2024-08-05 06:24:00,191][00139] DAMAGECOUNT value on done: 15532.0 [2024-08-05 06:24:00,191][00139] Sum rewards: -3.634, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.584', 'AMMO2': '0.007', 'ARMOR': '0.008', 'AMMO5': '0.009', 'weapon7': '0.024', 'weapon4': '0.030', 'AMMO4': '0.035', 'WEAPON4': '0.050', 'HITCOUNT': '0.090', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'weapon5': '0.132', 'AMMO3': '0.139', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.417', 'WEAPON3': '0.700', 'weapon2': '1.136', 'weapon3': '1.422', 'FRAGCOUNT': '2.000'} [2024-08-05 06:24:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2768896. Throughput: 0: 288.6. Samples: 692409. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:00,484][00034] Avg episode reward: [(0, '-5.125')] [2024-08-05 06:24:00,486][00132] Saving new best policy, reward=-5.125! [2024-08-05 06:24:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2777088. Throughput: 0: 289.4. Samples: 694184. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:05,484][00034] Avg episode reward: [(0, '-5.125')] [2024-08-05 06:24:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2777088. Throughput: 0: 289.8. Samples: 695930. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:10,484][00034] Avg episode reward: [(0, '-5.125')] [2024-08-05 06:24:11,500][00138] Updated weights for policy 0, policy_version 340 (0.0018) [2024-08-05 06:24:14,306][00139] DAMAGECOUNT value on done: 16117.0 [2024-08-05 06:24:14,307][00139] Sum rewards: -7.676, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.314', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.015', 'ARMOR': '0.020', 'AMMO2': '0.028', 'WEAPON1': '0.030', 'weapon5': '0.062', 'HITCOUNT': '0.130', 'AMMO4': '0.138', 'weapon4': '0.146', 'AMMO3': '0.165', 'WEAPON5': '0.200', 'WEAPON4': '0.300', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.894', 'weapon3': '1.212', 'weapon2': '1.248'} [2024-08-05 06:24:14,581][00139] DAMAGECOUNT value on done: 15597.0 [2024-08-05 06:24:14,582][00139] Sum rewards: -3.919, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.900', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.012', 'AMMO5': '0.025', 'WEAPON1': '0.040', 'weapon5': '0.040', 'AMMO4': '0.060', 'HITCOUNT': '0.070', 'AMMO3': '0.118', 'DAMAGECOUNT': '0.195', 'WEAPON5': '0.400', 'WEAPON3': '0.700', 'weapon2': '0.894', 'weapon3': '1.676'} [2024-08-05 06:24:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2785280. Throughput: 0: 289.8. Samples: 696798. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:15,485][00034] Avg episode reward: [(0, '-5.084')] [2024-08-05 06:24:15,493][00132] Saving new best policy, reward=-5.084! [2024-08-05 06:24:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2793472. Throughput: 0: 288.4. Samples: 698480. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:20,484][00034] Avg episode reward: [(0, '-5.084')] [2024-08-05 06:24:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2793472. Throughput: 0: 288.7. Samples: 700201. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:25,484][00034] Avg episode reward: [(0, '-5.084')] [2024-08-05 06:24:29,052][00139] DAMAGECOUNT value on done: 16142.0 [2024-08-05 06:24:29,292][00139] DAMAGECOUNT value on done: 15788.0 [2024-08-05 06:24:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2801664. Throughput: 0: 288.8. Samples: 701056. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:30,484][00034] Avg episode reward: [(0, '-5.046')] [2024-08-05 06:24:30,486][00132] Saving new best policy, reward=-5.046! [2024-08-05 06:24:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2809856. Throughput: 0: 288.9. Samples: 702776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:35,484][00034] Avg episode reward: [(0, '-5.046')] [2024-08-05 06:24:36,730][00139] Large shaping reward -2.549 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.3, -100.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 06:24:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2818048. Throughput: 0: 287.3. Samples: 704501. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:40,484][00034] Avg episode reward: [(0, '-5.046')] [2024-08-05 06:24:43,592][00139] DAMAGECOUNT value on done: 16181.0 [2024-08-05 06:24:43,593][00139] Sum rewards: -7.638, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.039', 'AMMO5': '0.015', 'AMMO2': '0.017', 'weapon4': '0.018', 'HITCOUNT': '0.030', 'AMMO4': '0.085', 'weapon5': '0.094', 'DAMAGECOUNT': '0.117', 'AMMO3': '0.146', 'WEAPON4': '0.200', 'WEAPON5': '0.350', 'WEAPON3': '0.850', 'weapon2': '1.216', 'weapon3': '1.512'} [2024-08-05 06:24:43,819][00139] DAMAGECOUNT value on done: 15868.0 [2024-08-05 06:24:43,819][00139] Sum rewards: -0.770, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.626', 'AMMO5': '0.003', 'AMMO2': '0.020', 'WEAPON1': '0.020', 'ARMOR': '0.032', 'WEAPON5': '0.050', 'HITCOUNT': '0.070', 'AMMO4': '0.098', 'WEAPON4': '0.100', 'weapon4': '0.106', 'AMMO3': '0.116', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'DAMAGECOUNT': '0.240', 'WEAPON3': '0.650', 'weapon2': '0.876', 'FRAGCOUNT': '1.000', 'weapon3': '1.876'} [2024-08-05 06:24:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2818048. Throughput: 0: 288.4. Samples: 705388. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:45,484][00034] Avg episode reward: [(0, '-4.973')] [2024-08-05 06:24:45,492][00132] Saving new best policy, reward=-4.973! [2024-08-05 06:24:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 2826240. Throughput: 0: 287.3. Samples: 707113. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:50,484][00034] Avg episode reward: [(0, '-4.973')] [2024-08-05 06:24:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2834432. Throughput: 0: 287.8. Samples: 708880. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:24:55,484][00034] Avg episode reward: [(0, '-4.973')] [2024-08-05 06:24:57,991][00139] DAMAGECOUNT value on done: 16459.0 [2024-08-05 06:24:57,992][00139] Sum rewards: -4.491, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.544', 'weapon5': '0.002', 'AMMO5': '0.007', 'AMMO2': '0.021', 'ARMOR': '0.036', 'WEAPON5': '0.100', 'AMMO4': '0.103', 'weapon4': '0.134', 'AMMO3': '0.150', 'HITCOUNT': '0.230', 'WEAPON4': '0.250', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.834', 'weapon2': '1.224', 'weapon3': '1.412', 'FRAGCOUNT': '3.000'} [2024-08-05 06:24:58,210][00139] DAMAGECOUNT value on done: 15996.0 [2024-08-05 06:24:58,210][00139] Sum rewards: -3.684, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.211', 'weapon4': '0.012', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'AMMO2': '0.021', 'weapon5': '0.052', 'ARMOR': '0.068', 'AMMO3': '0.103', 'AMMO4': '0.105', 'HITCOUNT': '0.110', 'WEAPON5': '0.200', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.384', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.242', 'weapon3': '1.398'} [2024-08-05 06:25:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2834432. Throughput: 0: 288.4. Samples: 709776. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:00,484][00034] Avg episode reward: [(0, '-4.915')] [2024-08-05 06:25:00,486][00132] Saving new best policy, reward=-4.915! [2024-08-05 06:25:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2842624. Throughput: 0: 289.9. Samples: 711526. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:05,484][00034] Avg episode reward: [(0, '-4.915')] [2024-08-05 06:25:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2850816. Throughput: 0: 290.5. Samples: 713274. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:10,484][00034] Avg episode reward: [(0, '-4.915')] [2024-08-05 06:25:12,378][00139] DAMAGECOUNT value on done: 16542.0 [2024-08-05 06:25:12,379][00139] Sum rewards: -5.048, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'HEALTH': '0.014', 'AMMO2': '0.018', 'HITCOUNT': '0.030', 'weapon4': '0.058', 'weapon5': '0.058', 'ARMOR': '0.088', 'AMMO4': '0.089', 'WEAPON4': '0.100', 'AMMO3': '0.132', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.249', 'WEAPON3': '0.600', 'weapon2': '1.146', 'weapon3': '1.700'} [2024-08-05 06:25:12,602][00139] DAMAGECOUNT value on done: 16251.0 [2024-08-05 06:25:12,602][00139] Sum rewards: -1.861, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.668', 'AMMO2': '0.003', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO4': '0.012', 'weapon5': '0.018', 'ARMOR': '0.036', 'AMMO3': '0.138', 'WEAPON5': '0.150', 'HITCOUNT': '0.170', 'DAMAGECOUNT': '0.765', 'WEAPON3': '0.850', 'weapon2': '1.172', 'weapon3': '1.726', 'FRAGCOUNT': '3.000'} [2024-08-05 06:25:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2859008. Throughput: 0: 291.0. Samples: 714153. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:15,485][00034] Avg episode reward: [(0, '-4.910')] [2024-08-05 06:25:15,492][00132] Saving new best policy, reward=-4.910! [2024-08-05 06:25:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2859008. Throughput: 0: 290.2. Samples: 715837. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:20,485][00034] Avg episode reward: [(0, '-4.910')] [2024-08-05 06:25:22,479][00138] Updated weights for policy 0, policy_version 350 (0.0017) [2024-08-05 06:25:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2867200. Throughput: 0: 291.0. Samples: 717598. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:25,484][00034] Avg episode reward: [(0, '-4.910')] [2024-08-05 06:25:27,003][00139] DAMAGECOUNT value on done: 16737.0 [2024-08-05 06:25:27,004][00139] Sum rewards: -4.372, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.211', 'AMMO2': '0.009', 'AMMO5': '0.013', 'weapon5': '0.030', 'AMMO4': '0.043', 'WEAPON4': '0.050', 'weapon4': '0.076', 'ARMOR': '0.092', 'HITCOUNT': '0.140', 'AMMO3': '0.152', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon3': '1.206', 'weapon2': '1.394'} [2024-08-05 06:25:27,229][00139] DAMAGECOUNT value on done: 16328.0 [2024-08-05 06:25:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2875392. Throughput: 0: 290.8. Samples: 718475. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:30,484][00034] Avg episode reward: [(0, '-4.858')] [2024-08-05 06:25:30,486][00132] Saving new best policy, reward=-4.858! [2024-08-05 06:25:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2875392. Throughput: 0: 290.6. Samples: 720188. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:35,484][00034] Avg episode reward: [(0, '-4.858')] [2024-08-05 06:25:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000351_2875392.pth... [2024-08-05 06:25:35,558][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000317_2596864.pth [2024-08-05 06:25:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2883584. Throughput: 0: 289.4. Samples: 721904. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:40,484][00034] Avg episode reward: [(0, '-4.858')] [2024-08-05 06:25:41,621][00139] DAMAGECOUNT value on done: 16922.0 [2024-08-05 06:25:41,621][00139] Sum rewards: -3.734, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.700', 'AMMO2': '0.012', 'weapon4': '0.026', 'AMMO4': '0.060', 'WEAPON4': '0.100', 'HITCOUNT': '0.140', 'AMMO3': '0.147', 'DAMAGECOUNT': '0.555', 'WEAPON3': '0.800', 'weapon2': '1.152', 'weapon3': '1.724', 'FRAGCOUNT': '2.000'} [2024-08-05 06:25:41,852][00139] DAMAGECOUNT value on done: 16408.0 [2024-08-05 06:25:41,852][00139] Sum rewards: -7.197, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.254', 'AMMO4': '-0.007', 'AMMO2': '-0.001', 'ARMOR': '0.008', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'weapon5': '0.028', 'weapon4': '0.040', 'WEAPON4': '0.050', 'HITCOUNT': '0.070', 'AMMO3': '0.139', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.240', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.130', 'weapon2': '1.678'} [2024-08-05 06:25:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2891776. Throughput: 0: 289.1. Samples: 722785. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:45,484][00034] Avg episode reward: [(0, '-4.859')] [2024-08-05 06:25:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2891776. Throughput: 0: 289.2. Samples: 724541. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:50,485][00034] Avg episode reward: [(0, '-4.859')] [2024-08-05 06:25:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2899968. Throughput: 0: 287.1. Samples: 726192. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:25:55,484][00034] Avg episode reward: [(0, '-4.859')] [2024-08-05 06:25:56,276][00139] DAMAGECOUNT value on done: 17106.0 [2024-08-05 06:25:56,277][00139] Sum rewards: -7.030, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.320', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'ARMOR': '0.008', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'weapon5': '0.032', 'AMMO4': '0.086', 'weapon7': '0.094', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HITCOUNT': '0.140', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.173', 'WEAPON7': '0.200', 'weapon4': '0.234', 'DAMAGECOUNT': '0.552', 'WEAPON3': '0.900', 'weapon3': '1.190', 'weapon2': '1.346'} [2024-08-05 06:25:56,493][00139] DAMAGECOUNT value on done: 16584.0 [2024-08-05 06:25:56,494][00139] Sum rewards: -4.631, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.084', 'AMMO5': '0.010', 'WEAPON1': '0.030', 'AMMO2': '0.032', 'ARMOR': '0.056', 'AMMO3': '0.114', 'weapon4': '0.114', 'WEAPON5': '0.150', 'HITCOUNT': '0.160', 'AMMO4': '0.161', 'WEAPON4': '0.300', 'DAMAGECOUNT': '0.528', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.274', 'weapon3': '1.624'} [2024-08-05 06:26:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2908160. Throughput: 0: 286.9. Samples: 727063. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:00,484][00034] Avg episode reward: [(0, '-4.879')] [2024-08-05 06:26:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2916352. Throughput: 0: 289.2. Samples: 728850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:05,484][00034] Avg episode reward: [(0, '-4.879')] [2024-08-05 06:26:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2916352. Throughput: 0: 287.8. Samples: 730548. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:10,485][00034] Avg episode reward: [(0, '-4.879')] [2024-08-05 06:26:10,771][00139] DAMAGECOUNT value on done: 17161.0 [2024-08-05 06:26:11,007][00139] DAMAGECOUNT value on done: 16671.0 [2024-08-05 06:26:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2924544. Throughput: 0: 287.0. Samples: 731389. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:15,484][00034] Avg episode reward: [(0, '-4.902')] [2024-08-05 06:26:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2932736. Throughput: 0: 287.3. Samples: 733117. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:20,485][00034] Avg episode reward: [(0, '-4.902')] [2024-08-05 06:26:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2932736. Throughput: 0: 286.7. Samples: 734804. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:25,485][00034] Avg episode reward: [(0, '-4.902')] [2024-08-05 06:26:25,609][00139] DAMAGECOUNT value on done: 17226.0 [2024-08-05 06:26:25,610][00139] Sum rewards: -3.448, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.882', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'weapon5': '0.012', 'AMMO2': '0.022', 'ARMOR': '0.048', 'HITCOUNT': '0.060', 'weapon4': '0.066', 'AMMO4': '0.107', 'AMMO3': '0.138', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.195', 'WEAPON4': '0.250', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.388', 'weapon3': '1.530'} [2024-08-05 06:26:25,837][00139] DAMAGECOUNT value on done: 16831.0 [2024-08-05 06:26:25,837][00139] Sum rewards: -3.654, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.488', 'AMMO5': '0.005', 'AMMO2': '0.010', 'WEAPON1': '0.010', 'weapon5': '0.012', 'weapon4': '0.020', 'ARMOR': '0.036', 'AMMO4': '0.049', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'HITCOUNT': '0.130', 'AMMO3': '0.158', 'DAMAGECOUNT': '0.480', 'WEAPON3': 
'0.850', 'weapon2': '1.188', 'weapon3': '1.736', 'FRAGCOUNT': '2.000'} [2024-08-05 06:26:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2940928. Throughput: 0: 285.9. Samples: 735650. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:30,484][00034] Avg episode reward: [(0, '-4.878')] [2024-08-05 06:26:33,876][00138] Updated weights for policy 0, policy_version 360 (0.0018) [2024-08-05 06:26:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2949120. Throughput: 0: 285.1. Samples: 737369. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:35,484][00034] Avg episode reward: [(0, '-4.878')] [2024-08-05 06:26:40,292][00139] DAMAGECOUNT value on done: 17301.0 [2024-08-05 06:26:40,293][00139] Sum rewards: -3.039, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.612', 'weapon5': '0.002', 'AMMO5': '0.007', 'AMMO2': '0.012', 'WEAPON1': '0.020', 'weapon4': '0.052', 'AMMO4': '0.061', 'HITCOUNT': '0.070', 'AMMO3': '0.075', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.225', 'WEAPON3': '0.500', 'FRAGCOUNT': '1.000', 'weapon3': '1.464', 'weapon2': '1.584'} [2024-08-05 06:26:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2949120. Throughput: 0: 286.5. Samples: 739085. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:40,485][00034] Avg episode reward: [(0, '-4.873')] [2024-08-05 06:26:40,524][00139] DAMAGECOUNT value on done: 16997.0 [2024-08-05 06:26:40,524][00139] Sum rewards: -4.778, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.006', 'ARMOR': '0.012', 'AMMO5': '0.012', 'AMMO2': '0.013', 'weapon5': '0.018', 'weapon4': '0.032', 'AMMO4': '0.064', 'WEAPON4': '0.100', 'HITCOUNT': '0.130', 'AMMO3': '0.135', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.498', 'WEAPON3': '0.800', 'weapon2': '1.024', 'weapon3': '1.990', 'FRAGCOUNT': '2.500'} [2024-08-05 06:26:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2957312. Throughput: 0: 286.2. Samples: 739940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:45,485][00034] Avg episode reward: [(0, '-4.875')] [2024-08-05 06:26:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2965504. Throughput: 0: 284.2. Samples: 741638. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:50,485][00034] Avg episode reward: [(0, '-4.875')] [2024-08-05 06:26:55,318][00139] DAMAGECOUNT value on done: 17566.0 [2024-08-05 06:26:55,319][00139] Sum rewards: -3.206, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.936', 'AMMO2': '0.003', 'ARMOR': '0.012', 'AMMO4': '0.017', 'WEAPON1': '0.020', 'weapon4': '0.054', 'AMMO3': '0.095', 'HITCOUNT': '0.160', 'WEAPON4': '0.200', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.795', 'weapon3': '1.096', 'weapon2': '1.928', 'FRAGCOUNT': '2.000'} [2024-08-05 06:26:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2965504. Throughput: 0: 283.0. Samples: 743285. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:26:55,485][00034] Avg episode reward: [(0, '-4.916')] [2024-08-05 06:26:55,552][00139] DAMAGECOUNT value on done: 17292.0 [2024-08-05 06:26:55,552][00139] Sum rewards: -4.336, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.634', 'AMMO2': '0.017', 'AMMO5': '0.022', 'WEAPON1': '0.030', 'ARMOR': '0.040', 'weapon5': '0.066', 'weapon4': '0.074', 'AMMO4': '0.082', 'WEAPON4': '0.100', 'AMMO3': '0.124', 'HITCOUNT': '0.240', 'WEAPON5': '0.350', 'WEAPON3': '0.450', 'DAMAGECOUNT': '0.885', 'weapon3': '1.078', 'weapon2': '1.740', 'FRAGCOUNT': '3.000'} [2024-08-05 06:27:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2973696. Throughput: 0: 282.8. Samples: 744113. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:00,484][00034] Avg episode reward: [(0, '-4.916')] [2024-08-05 06:27:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2981888. Throughput: 0: 282.6. Samples: 745836. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:05,484][00034] Avg episode reward: [(0, '-4.916')] [2024-08-05 06:27:10,145][00139] DAMAGECOUNT value on done: 17780.0 [2024-08-05 06:27:10,146][00139] Sum rewards: -3.981, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.094', 'ARMOR': '0.004', 'AMMO5': '0.005', 'weapon4': '0.012', 'WEAPON1': '0.020', 'AMMO2': '0.020', 'weapon5': '0.026', 'WEAPON5': '0.050', 'AMMO4': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.176', 'HITCOUNT': '0.190', 'DAMAGECOUNT': '0.642', 'WEAPON3': '1.000', 'weapon3': '1.268', 'weapon2': '1.500', 'FRAGCOUNT': '2.000'} [2024-08-05 06:27:10,384][00139] DAMAGECOUNT value on done: 17471.0 [2024-08-05 06:27:10,384][00139] Sum rewards: -3.608, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.885', 'weapon5': '0.012', 'AMMO2': '0.016', 'AMMO5': '0.028', 'ARMOR': '0.040', 'WEAPON1': '0.040', 'WEAPON4': '0.050', 'AMMO4': '0.078', 'HITCOUNT': '0.130', 'AMMO3': '0.164', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.537', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.306', 'weapon3': '1.726'} [2024-08-05 06:27:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 2990080. Throughput: 0: 282.8. Samples: 747532. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:10,484][00034] Avg episode reward: [(0, '-4.894')] [2024-08-05 06:27:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 2990080. Throughput: 0: 283.2. Samples: 748394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:15,485][00034] Avg episode reward: [(0, '-4.894')] [2024-08-05 06:27:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 2998272. Throughput: 0: 283.2. Samples: 750111. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:20,485][00034] Avg episode reward: [(0, '-4.894')] [2024-08-05 06:27:24,823][00139] DAMAGECOUNT value on done: 17962.0 [2024-08-05 06:27:24,824][00139] Sum rewards: -7.337, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.132', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'weapon5': '0.016', 'ARMOR': '0.032', 'AMMO2': '0.049', 'weapon4': '0.122', 'HITCOUNT': '0.140', 'WEAPON5': '0.150', 'AMMO3': '0.163', 'AMMO4': '0.245', 'WEAPON4': '0.400', 'DAMAGECOUNT': '0.546', 'WEAPON3': '0.850', 'weapon3': '1.286', 'weapon2': '1.538'} [2024-08-05 06:27:25,073][00139] DAMAGECOUNT value on done: 17606.0 [2024-08-05 06:27:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3006464. Throughput: 0: 282.8. Samples: 751810. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:25,484][00034] Avg episode reward: [(0, '-4.892')] [2024-08-05 06:27:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3006464. Throughput: 0: 283.1. Samples: 752681. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:30,484][00034] Avg episode reward: [(0, '-4.892')] [2024-08-05 06:27:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3014656. Throughput: 0: 283.1. Samples: 754378. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:35,485][00034] Avg episode reward: [(0, '-4.892')] [2024-08-05 06:27:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000368_3014656.pth... 
[2024-08-05 06:27:35,575][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000334_2736128.pth [2024-08-05 06:27:39,566][00139] DAMAGECOUNT value on done: 18042.0 [2024-08-05 06:27:39,566][00139] Sum rewards: -5.870, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.296', 'AMMO5': '0.007', 'weapon4': '0.030', 'AMMO2': '0.031', 'ARMOR': '0.032', 'weapon5': '0.046', 'HITCOUNT': '0.080', 'AMMO3': '0.138', 'WEAPON5': '0.150', 'AMMO4': '0.154', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.240', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.534', 'weapon3': '1.534'} [2024-08-05 06:27:39,804][00139] DAMAGECOUNT value on done: 17671.0 [2024-08-05 06:27:39,804][00139] Sum rewards: -1.925, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.958', 'weapon5': '0.002', 'AMMO5': '0.005', 'AMMO2': '0.008', 'weapon4': '0.010', 'AMMO4': '0.038', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'HITCOUNT': '0.060', 'AMMO3': '0.135', 'DAMAGECOUNT': '0.195', 'ARMOR': '0.480', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.176', 'weapon3': '1.874'} [2024-08-05 06:27:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3022848. Throughput: 0: 284.8. Samples: 756100. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:40,484][00034] Avg episode reward: [(0, '-4.788')] [2024-08-05 06:27:40,486][00132] Saving new best policy, reward=-4.788! [2024-08-05 06:27:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3022848. Throughput: 0: 286.0. Samples: 756985. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:45,484][00034] Avg episode reward: [(0, '-4.788')] [2024-08-05 06:27:45,864][00138] Updated weights for policy 0, policy_version 370 (0.0017) [2024-08-05 06:27:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3031040. Throughput: 0: 286.3. Samples: 758721. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:50,484][00034] Avg episode reward: [(0, '-4.788')] [2024-08-05 06:27:54,030][00139] DAMAGECOUNT value on done: 18172.0 [2024-08-05 06:27:54,030][00139] Sum rewards: -4.344, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.904', 'weapon5': '0.004', 'AMMO2': '0.023', 'AMMO5': '0.028', 'HITCOUNT': '0.080', 'AMMO4': '0.114', 'AMMO3': '0.149', 'WEAPON4': '0.200', 'weapon4': '0.238', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.390', 'ARMOR': '0.444', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.306', 'weapon2': '1.334'} [2024-08-05 06:27:54,258][00139] DAMAGECOUNT value on done: 17766.0 [2024-08-05 06:27:54,258][00139] Sum rewards: -6.467, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.880', 'AMMO5': '0.005', 'weapon5': '0.020', 'AMMO2': '0.032', 'ARMOR': '0.048', 'HITCOUNT': '0.090', 'WEAPON5': '0.100', 'weapon4': '0.108', 'AMMO4': '0.158', 'AMMO3': '0.174', 'DAMAGECOUNT': '0.285', 'WEAPON4': '0.300', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.040', 'weapon2': '1.854'} [2024-08-05 06:27:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3039232. Throughput: 0: 287.0. Samples: 760446. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:27:55,485][00034] Avg episode reward: [(0, '-4.800')] [2024-08-05 06:28:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3047424. Throughput: 0: 286.0. Samples: 761263. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:00,485][00034] Avg episode reward: [(0, '-4.800')] [2024-08-05 06:28:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3047424. Throughput: 0: 285.3. Samples: 762948. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:05,485][00034] Avg episode reward: [(0, '-4.800')] [2024-08-05 06:28:08,801][00139] DAMAGECOUNT value on done: 18673.0 [2024-08-05 06:28:08,802][00139] Sum rewards: 0.847, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.040', 'AMMO2': '0.002', 'AMMO4': '0.009', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'weapon4': '0.026', 'weapon5': '0.068', 'WEAPON4': '0.100', 'AMMO3': '0.110', 'HITCOUNT': '0.260', 'WEAPON5': '0.300', 'ARMOR': '0.531', 'WEAPON3': '0.750', 'weapon2': '1.064', 'DAMAGECOUNT': '1.503', 'weapon3': '1.632', 'FRAGCOUNT': '4.000'} [2024-08-05 06:28:09,044][00139] DAMAGECOUNT value on done: 18278.0 [2024-08-05 06:28:09,045][00139] Sum rewards: -3.609, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.955', 'weapon4': '0.002', 'AMMO2': '0.009', 'weapon5': '0.014', 'AMMO5': '0.025', 'AMMO4': '0.044', 'WEAPON4': '0.100', 'AMMO3': '0.224', 'WEAPON5': '0.350', 'HITCOUNT': '0.390', 'weapon2': '1.090', 'WEAPON3': '1.150', 'DAMAGECOUNT': '1.536', 'weapon3': '1.912', 'FRAGCOUNT': '5.000'} [2024-08-05 06:28:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3055616. Throughput: 0: 287.3. Samples: 764739. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:10,484][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:10,488][00132] Saving new best policy, reward=-4.650! [2024-08-05 06:28:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3063808. Throughput: 0: 287.0. Samples: 765597. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:15,485][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3063808. Throughput: 0: 287.6. Samples: 767319. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:20,485][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:23,214][00139] DAMAGECOUNT value on done: 18803.0 [2024-08-05 06:28:23,214][00139] Sum rewards: -3.651, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.963', 'AMMO4': '-0.022', 'AMMO2': '-0.004', 'weapon5': '0.002', 'AMMO5': '0.018', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'weapon4': '0.066', 'HITCOUNT': '0.090', 'AMMO3': '0.168', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.390', 'ARMOR': '0.562', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.348', 'weapon3': '1.674'} [2024-08-05 06:28:23,424][00139] DAMAGECOUNT value on done: 18327.0 [2024-08-05 06:28:23,424][00139] Sum rewards: -4.702, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.003', 'weapon4': '0.014', 'AMMO2': '0.015', 'weapon5': '0.018', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'ARMOR': '0.060', 'HITCOUNT': '0.060', 'AMMO4': '0.072', 'AMMO3': '0.099', 'DAMAGECOUNT': '0.147', 'HEALTH': '0.328', 'WEAPON3': '0.550', 'weapon2': '1.312', 'weapon3': '1.520'} [2024-08-05 06:28:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3072000. Throughput: 0: 288.7. Samples: 769090. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:25,485][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3080192. Throughput: 0: 287.3. Samples: 769914. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:30,484][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3080192. Throughput: 0: 286.9. Samples: 771633. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:35,484][00034] Avg episode reward: [(0, '-4.650')] [2024-08-05 06:28:37,913][00139] DAMAGECOUNT value on done: 19186.0 [2024-08-05 06:28:37,914][00139] Sum rewards: -2.511, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.118', 'AMMO2': '0.012', 'AMMO5': '0.018', 'WEAPON1': '0.030', 'AMMO4': '0.059', 'weapon5': '0.060', 'AMMO3': '0.181', 'HITCOUNT': '0.270', 'WEAPON5': '0.300', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.149', 'weapon2': '1.456', 'weapon3': '1.522', 'FRAGCOUNT': '4.000'} [2024-08-05 06:28:38,124][00139] DAMAGECOUNT value on done: 18467.0 [2024-08-05 06:28:38,125][00139] Sum rewards: -5.635, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.938', 'AMMO2': '0.017', 'weapon4': '0.052', 'AMMO4': '0.086', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'ARMOR': '0.112', 'AMMO3': '0.182', 'DAMAGECOUNT': '0.420', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon3': '1.302', 'weapon2': '1.472'} [2024-08-05 06:28:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3088384. Throughput: 0: 287.6. Samples: 773388. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:40,484][00034] Avg episode reward: [(0, '-4.696')] [2024-08-05 06:28:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3096576. Throughput: 0: 288.7. Samples: 774254. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:45,484][00034] Avg episode reward: [(0, '-4.696')] [2024-08-05 06:28:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3104768. Throughput: 0: 291.5. Samples: 776066. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:50,484][00034] Avg episode reward: [(0, '-4.696')] [2024-08-05 06:28:52,086][00139] DAMAGECOUNT value on done: 19226.0 [2024-08-05 06:28:52,086][00139] Sum rewards: -10.240, reward structure: {'DEATHCOUNT': '-15.000', 'HEALTH': '-1.298', 'weapon5': '0.004', 'AMMO2': '0.014', 'AMMO5': '0.017', 'HITCOUNT': '0.040', 'ARMOR': '0.040', 'AMMO4': '0.069', 'weapon4': '0.080', 'DAMAGECOUNT': '0.120', 'WEAPON4': '0.150', 'AMMO3': '0.206', 'WEAPON5': '0.250', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon3': '1.268', 'weapon2': '1.750'} [2024-08-05 06:28:52,322][00139] DAMAGECOUNT value on done: 18781.0 [2024-08-05 06:28:52,323][00139] Sum rewards: -2.103, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.649', 'AMMO5': '0.003', 'WEAPON1': '0.010', 'AMMO2': '0.021', 'WEAPON5': '0.050', 'weapon4': '0.054', 'AMMO4': '0.106', 'AMMO3': '0.133', 'HITCOUNT': '0.150', 'WEAPON4': '0.150', 'ARMOR': '0.568', 'DAMAGECOUNT': '0.717', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.264', 'weapon3': '1.820'} [2024-08-05 06:28:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3104768. Throughput: 0: 291.0. Samples: 777835. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:28:55,485][00034] Avg episode reward: [(0, '-4.688')] [2024-08-05 06:28:56,581][00138] Updated weights for policy 0, policy_version 380 (0.0017) [2024-08-05 06:29:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3112960. Throughput: 0: 291.0. Samples: 778691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:00,484][00034] Avg episode reward: [(0, '-4.688')] [2024-08-05 06:29:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3121152. Throughput: 0: 289.9. Samples: 780364. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:05,486][00034] Avg episode reward: [(0, '-4.688')] [2024-08-05 06:29:06,779][00139] DAMAGECOUNT value on done: 19439.0 [2024-08-05 06:29:07,008][00139] DAMAGECOUNT value on done: 18971.0 [2024-08-05 06:29:07,009][00139] Sum rewards: -0.325, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.616', 'AMMO5': '0.010', 'AMMO2': '0.012', 'HITCOUNT': '0.050', 'weapon5': '0.060', 'AMMO4': '0.062', 'AMMO3': '0.101', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.186', 'ARMOR': '0.532', 'DAMAGECOUNT': '0.570', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.320', 'weapon3': '1.488'} [2024-08-05 06:29:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3121152. Throughput: 0: 289.6. Samples: 782121. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:10,484][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:29:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3129344. Throughput: 0: 290.5. Samples: 782985. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:15,485][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:29:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3137536. Throughput: 0: 290.1. Samples: 784689. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:20,485][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:29:21,309][00139] DAMAGECOUNT value on done: 19554.0 [2024-08-05 06:29:21,309][00139] Sum rewards: -8.055, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.923', 'weapon5': '0.012', 'AMMO5': '0.013', 'AMMO2': '0.024', 'weapon4': '0.030', 'ARMOR': '0.076', 'WEAPON4': '0.100', 'AMMO4': '0.120', 'HITCOUNT': '0.120', 'AMMO3': '0.214', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.345', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'weapon3': '1.238', 'weapon2': '1.926'} [2024-08-05 06:29:21,540][00139] DAMAGECOUNT value on done: 19243.0 [2024-08-05 06:29:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3145728. Throughput: 0: 290.0. Samples: 786436. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:25,484][00034] Avg episode reward: [(0, '-4.693')] [2024-08-05 06:29:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3145728. Throughput: 0: 289.5. Samples: 787281. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:30,486][00034] Avg episode reward: [(0, '-4.693')] [2024-08-05 06:29:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3153920. Throughput: 0: 286.6. Samples: 788965. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:35,484][00034] Avg episode reward: [(0, '-4.693')] [2024-08-05 06:29:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_3153920.pth... 
[2024-08-05 06:29:35,565][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000351_2875392.pth [2024-08-05 06:29:36,084][00139] DAMAGECOUNT value on done: 19792.0 [2024-08-05 06:29:36,085][00139] Sum rewards: -9.351, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.684', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.009', 'WEAPON1': '0.010', 'AMMO5': '0.011', 'AMMO4': '0.042', 'ARMOR': '0.085', 'weapon5': '0.094', 'HITCOUNT': '0.110', 'AMMO3': '0.242', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.714', 'weapon2': '1.012', 'WEAPON3': '1.200', 'weapon3': '2.054'} [2024-08-05 06:29:36,319][00139] DAMAGECOUNT value on done: 19393.0 [2024-08-05 06:29:36,319][00139] Sum rewards: -10.314, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.724', 'FRAGCOUNT': '-1.500', 'AMMO2': '0.000', 'AMMO4': '0.000', 'AMMO5': '0.012', 'weapon5': '0.030', 'ARMOR': '0.056', 'HITCOUNT': '0.080', 'WEAPON4': '0.100', 'weapon4': '0.108', 'WEAPON5': '0.150', 'AMMO3': '0.237', 'DAMAGECOUNT': '0.450', 'WEAPON3': '1.050', 'weapon3': '1.354', 'weapon2': '1.532'} [2024-08-05 06:29:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3162112. Throughput: 0: 285.3. Samples: 790673. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:40,484][00034] Avg episode reward: [(0, '-4.778')] [2024-08-05 06:29:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3162112. Throughput: 0: 285.7. Samples: 791546. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:45,485][00034] Avg episode reward: [(0, '-4.778')] [2024-08-05 06:29:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3170304. Throughput: 0: 286.7. Samples: 793266. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:50,484][00034] Avg episode reward: [(0, '-4.778')] [2024-08-05 06:29:50,727][00139] DAMAGECOUNT value on done: 20017.0 [2024-08-05 06:29:50,727][00139] Sum rewards: -3.547, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.480', 'weapon5': '0.004', 'AMMO5': '0.010', 'weapon7': '0.010', 'ARMOR': '0.016', 'weapon4': '0.020', 'AMMO2': '0.025', 'AMMO4': '0.123', 'WEAPON4': '0.150', 'AMMO3': '0.192', 'HITCOUNT': '0.200', 'WEAPON5': '0.200', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'DAMAGECOUNT': '0.675', 'WEAPON3': '1.000', 'weapon2': '1.074', 'weapon3': '1.884', 'FRAGCOUNT': '2.000'} [2024-08-05 06:29:50,947][00139] DAMAGECOUNT value on done: 19932.0 [2024-08-05 06:29:50,948][00139] Sum rewards: -0.665, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.654', 'ARMOR': '0.008', 'AMMO5': '0.010', 'AMMO2': '0.021', 'weapon4': '0.034', 'weapon5': '0.084', 'AMMO4': '0.104', 'WEAPON4': '0.150', 'AMMO3': '0.236', 'WEAPON5': '0.250', 'HITCOUNT': '0.330', 'WEAPON3': '1.050', 'weapon2': '1.488', 'weapon3': '1.606', 'DAMAGECOUNT': '1.617', 'FRAGCOUNT': '6.000'} [2024-08-05 06:29:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3178496. Throughput: 0: 286.3. Samples: 795005. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:29:55,485][00034] Avg episode reward: [(0, '-4.699')] [2024-08-05 06:30:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3178496. Throughput: 0: 286.4. Samples: 795873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:00,484][00034] Avg episode reward: [(0, '-4.699')] [2024-08-05 06:30:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3186688. Throughput: 0: 285.0. Samples: 797512. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:05,485][00034] Avg episode reward: [(0, '-4.699')] [2024-08-05 06:30:05,565][00139] DAMAGECOUNT value on done: 20347.0 [2024-08-05 06:30:05,565][00139] Sum rewards: -6.352, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-2.500', 'HEALTH': '-0.460', 'AMMO5': '0.015', 'AMMO2': '0.034', 'ARMOR': '0.040', 'weapon5': '0.044', 'weapon4': '0.062', 'AMMO3': '0.101', 'AMMO4': '0.171', 'HITCOUNT': '0.220', 'WEAPON4': '0.250', 'WEAPON5': '0.250', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.990', 'weapon3': '1.134', 'weapon2': '1.796'} [2024-08-05 06:30:05,783][00139] DAMAGECOUNT value on done: 19956.0 [2024-08-05 06:30:08,285][00138] Updated weights for policy 0, policy_version 390 (0.0017) [2024-08-05 06:30:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3194880. Throughput: 0: 284.6. Samples: 799244. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:10,485][00034] Avg episode reward: [(0, '-4.682')] [2024-08-05 06:30:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3194880. Throughput: 0: 284.8. Samples: 800096. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:15,484][00034] Avg episode reward: [(0, '-4.682')] [2024-08-05 06:30:20,086][00139] DAMAGECOUNT value on done: 20456.0 [2024-08-05 06:30:20,087][00139] Sum rewards: -1.739, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.558', 'AMMO2': '0.008', 'AMMO5': '0.015', 'ARMOR': '0.032', 'AMMO4': '0.037', 'weapon7': '0.050', 'HITCOUNT': '0.070', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'weapon5': '0.120', 'AMMO3': '0.144', 'weapon4': '0.264', 'DAMAGECOUNT': '0.327', 'WEAPON5': '0.350', 'WEAPON3': '0.850', 'weapon3': '1.082', 'weapon2': '1.570', 'FRAGCOUNT': '2.000'} [2024-08-05 06:30:20,302][00139] DAMAGECOUNT value on done: 20011.0 [2024-08-05 06:30:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3203072. Throughput: 0: 286.0. Samples: 801833. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:20,485][00034] Avg episode reward: [(0, '-4.701')] [2024-08-05 06:30:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3211264. Throughput: 0: 286.6. Samples: 803572. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:25,484][00034] Avg episode reward: [(0, '-4.701')] [2024-08-05 06:30:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3219456. Throughput: 0: 286.4. Samples: 804434. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:30,484][00034] Avg episode reward: [(0, '-4.701')] [2024-08-05 06:30:34,825][00139] DAMAGECOUNT value on done: 20590.0 [2024-08-05 06:30:34,826][00139] Sum rewards: -5.656, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.076', 'AMMO5': '0.008', 'WEAPON1': '0.010', 'weapon5': '0.022', 'AMMO2': '0.039', 'WEAPON5': '0.050', 'ARMOR': '0.060', 'AMMO3': '0.122', 'HITCOUNT': '0.130', 'AMMO4': '0.194', 'DAMAGECOUNT': '0.402', 'weapon4': '0.446', 'WEAPON4': '0.500', 'WEAPON3': '0.550', 'weapon3': '1.184', 'weapon2': '1.454'} [2024-08-05 06:30:35,057][00139] DAMAGECOUNT value on done: 20321.0 [2024-08-05 06:30:35,057][00139] Sum rewards: -5.921, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.012', 'weapon5': '0.006', 'AMMO2': '0.007', 'AMMO5': '0.010', 'weapon4': '0.020', 'AMMO4': '0.036', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'weapon7': '0.064', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.200', 'AMMO3': '0.253', 'HITCOUNT': '0.280', 'DAMAGECOUNT': '0.930', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'weapon2': '1.352', 'weapon3': '1.642'} [2024-08-05 06:30:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3219456. Throughput: 0: 286.1. Samples: 806141. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:35,484][00034] Avg episode reward: [(0, '-4.681')] [2024-08-05 06:30:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3227648. Throughput: 0: 285.5. Samples: 807853. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:40,484][00034] Avg episode reward: [(0, '-4.681')] [2024-08-05 06:30:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3235840. Throughput: 0: 285.9. Samples: 808738. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:45,485][00034] Avg episode reward: [(0, '-4.681')] [2024-08-05 06:30:49,329][00139] DAMAGECOUNT value on done: 20650.0 [2024-08-05 06:30:49,534][00139] DAMAGECOUNT value on done: 20481.0 [2024-08-05 06:30:49,535][00139] Sum rewards: -5.475, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.072', 'weapon5': '0.012', 'AMMO5': '0.017', 'AMMO2': '0.023', 'weapon4': '0.096', 'AMMO4': '0.115', 'HITCOUNT': '0.180', 'AMMO3': '0.182', 'WEAPON4': '0.200', 'WEAPON5': '0.350', 'ARMOR': '0.444', 'DAMAGECOUNT': '0.480', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.900', 'weapon2': '1.264', 'weapon3': '1.584'} [2024-08-05 06:30:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3235840. Throughput: 0: 288.0. Samples: 810473. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:50,484][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:30:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3244032. Throughput: 0: 287.6. Samples: 812187. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:30:55,484][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:31:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3252224. Throughput: 0: 288.2. Samples: 813063. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:00,485][00034] Avg episode reward: [(0, '-4.666')] [2024-08-05 06:31:04,065][00139] DAMAGECOUNT value on done: 20964.0 [2024-08-05 06:31:04,065][00139] Sum rewards: -4.699, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.621', 'AMMO4': '-0.037', 'AMMO2': '-0.007', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'ARMOR': '0.032', 'WEAPON5': '0.100', 'AMMO3': '0.261', 'HITCOUNT': '0.300', 'DAMAGECOUNT': '0.942', 'WEAPON3': '1.100', 'weapon2': '1.394', 'weapon3': '1.812', 'FRAGCOUNT': '3.000'} [2024-08-05 06:31:04,290][00139] DAMAGECOUNT value on done: 20666.0 [2024-08-05 06:31:04,290][00139] Sum rewards: -5.301, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.600', 'weapon5': '0.002', 'AMMO2': '0.009', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'AMMO4': '0.043', 'WEAPON5': '0.100', 'AMMO3': '0.142', 'HITCOUNT': '0.150', 'weapon4': '0.160', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.555', 'WEAPON3': '0.850', 'weapon3': '1.306', 'weapon2': '1.502', 'FRAGCOUNT': '2.000'} [2024-08-05 06:31:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3260416. Throughput: 0: 287.1. Samples: 814754. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:05,487][00034] Avg episode reward: [(0, '-4.676')] [2024-08-05 06:31:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3260416. Throughput: 0: 286.3. Samples: 816454. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:10,485][00034] Avg episode reward: [(0, '-4.676')] [2024-08-05 06:31:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3268608. Throughput: 0: 286.4. Samples: 817324. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:15,485][00034] Avg episode reward: [(0, '-4.676')] [2024-08-05 06:31:18,633][00139] DAMAGECOUNT value on done: 21278.0 [2024-08-05 06:31:18,634][00139] Sum rewards: -1.847, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.123', 'AMMO5': '0.003', 'WEAPON1': '0.010', 'weapon5': '0.012', 'AMMO2': '0.017', 'weapon4': '0.034', 'WEAPON5': '0.050', 'AMMO4': '0.082', 'WEAPON4': '0.100', 'AMMO3': '0.158', 'HITCOUNT': '0.250', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.942', 'weapon3': '1.326', 'weapon2': '1.792', 'FRAGCOUNT': '4.000'} [2024-08-05 06:31:18,891][00139] DAMAGECOUNT value on done: 20866.0 [2024-08-05 06:31:19,671][00138] Updated weights for policy 0, policy_version 400 (0.0017) [2024-08-05 06:31:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3276800. Throughput: 0: 287.2. Samples: 819064. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:20,484][00034] Avg episode reward: [(0, '-4.612')] [2024-08-05 06:31:20,488][00132] Saving new best policy, reward=-4.612! [2024-08-05 06:31:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3276800. Throughput: 0: 287.7. Samples: 820799. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:25,485][00034] Avg episode reward: [(0, '-4.612')] [2024-08-05 06:31:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3284992. Throughput: 0: 287.3. Samples: 821666. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:30,484][00034] Avg episode reward: [(0, '-4.612')] [2024-08-05 06:31:33,127][00139] DAMAGECOUNT value on done: 21448.0 [2024-08-05 06:31:33,128][00139] Sum rewards: -4.047, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.900', 'AMMO5': '0.010', 'weapon5': '0.010', 'WEAPON1': '0.020', 'AMMO2': '0.025', 'ARMOR': '0.040', 'AMMO4': '0.124', 'HITCOUNT': '0.130', 'AMMO3': '0.132', 'weapon4': '0.162', 'WEAPON5': '0.200', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.650', 'weapon2': '0.918', 'FRAGCOUNT': '1.000', 'weapon3': '1.722'} [2024-08-05 06:31:33,368][00139] DAMAGECOUNT value on done: 21222.0 [2024-08-05 06:31:33,369][00139] Sum rewards: -1.907, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.211', 'AMMO5': '0.008', 'WEAPON1': '0.010', 'AMMO2': '0.013', 'weapon5': '0.030', 'ARMOR': '0.044', 'weapon7': '0.058', 'AMMO4': '0.067', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.106', 'WEAPON5': '0.150', 'HITCOUNT': '0.160', 'WEAPON4': '0.200', 'weapon4': '0.404', 'WEAPON3': '0.450', 'weapon3': '0.778', 'DAMAGECOUNT': '1.068', 'weapon2': '1.708', 'FRAGCOUNT': '2.000'} [2024-08-05 06:31:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3293184. Throughput: 0: 287.2. Samples: 823399. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:35,484][00034] Avg episode reward: [(0, '-4.585')] [2024-08-05 06:31:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000402_3293184.pth... [2024-08-05 06:31:35,575][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000368_3014656.pth [2024-08-05 06:31:35,585][00132] Saving new best policy, reward=-4.585! [2024-08-05 06:31:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3293184. Throughput: 0: 286.5. Samples: 825079. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:40,484][00034] Avg episode reward: [(0, '-4.585')] [2024-08-05 06:31:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3301376. Throughput: 0: 286.3. Samples: 825947. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:45,485][00034] Avg episode reward: [(0, '-4.585')] [2024-08-05 06:31:47,875][00139] DAMAGECOUNT value on done: 21532.0 [2024-08-05 06:31:47,875][00139] Sum rewards: -7.810, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.460', 'FRAGCOUNT': '-1.500', 'ARMOR': '0.016', 'AMMO5': '0.019', 'weapon5': '0.020', 'AMMO2': '0.020', 'weapon7': '0.078', 'AMMO4': '0.100', 'HITCOUNT': '0.100', 'AMMO3': '0.114', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.252', 'WEAPON4': '0.300', 'WEAPON5': '0.300', 'weapon4': '0.300', 'WEAPON3': '0.650', 'weapon3': '0.922', 'weapon2': '1.518'} [2024-08-05 06:31:48,145][00139] DAMAGECOUNT value on done: 21334.0 [2024-08-05 06:31:48,146][00139] Sum rewards: -2.223, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.457', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.019', 'ARMOR': '0.076', 'AMMO4': '0.096', 'weapon4': '0.108', 'AMMO3': '0.109', 'HITCOUNT': '0.120', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.336', 'WEAPON3': '0.650', 'weapon3': '1.410', 'weapon2': '1.444', 'FRAGCOUNT': '2.000'} [2024-08-05 06:31:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3309568. Throughput: 0: 287.0. Samples: 827671. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:50,484][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:31:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3317760. Throughput: 0: 288.2. Samples: 829424. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:31:55,485][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:32:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3317760. Throughput: 0: 288.2. Samples: 830295. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:00,485][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:32:02,480][00139] DAMAGECOUNT value on done: 21584.0 [2024-08-05 06:32:02,695][00139] DAMAGECOUNT value on done: 21464.0 [2024-08-05 06:32:02,695][00139] Sum rewards: -3.791, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.496', 'AMMO5': '0.010', 'AMMO2': '0.027', 'ARMOR': '0.064', 'weapon4': '0.064', 'AMMO3': '0.102', 'HITCOUNT': '0.110', 'AMMO4': '0.134', 'WEAPON5': '0.150', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon2': '1.264', 'weapon3': '1.540'} [2024-08-05 06:32:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3325952. Throughput: 0: 287.9. Samples: 832019. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:05,484][00034] Avg episode reward: [(0, '-4.565')] [2024-08-05 06:32:05,492][00132] Saving new best policy, reward=-4.565! [2024-08-05 06:32:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3334144. Throughput: 0: 286.6. Samples: 833694. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:10,484][00034] Avg episode reward: [(0, '-4.565')] [2024-08-05 06:32:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3334144. Throughput: 0: 286.7. Samples: 834567. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:15,484][00034] Avg episode reward: [(0, '-4.565')] [2024-08-05 06:32:17,122][00139] DAMAGECOUNT value on done: 21606.0 [2024-08-05 06:32:17,123][00139] Sum rewards: -3.270, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.128', 'AMMO2': '0.030', 'HITCOUNT': '0.030', 'weapon4': '0.046', 'DAMAGECOUNT': '0.066', 'ARMOR': '0.088', 'AMMO3': '0.126', 'AMMO4': '0.148', 'WEAPON4': '0.200', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.216', 'weapon3': '1.658'} [2024-08-05 06:32:17,328][00139] DAMAGECOUNT value on done: 21610.0 [2024-08-05 06:32:17,329][00139] Sum rewards: -5.451, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.568', 'weapon5': '0.002', 'AMMO2': '0.012', 'weapon4': '0.016', 'AMMO5': '0.018', 'AMMO4': '0.059', 'ARMOR': '0.080', 'HITCOUNT': '0.110', 'AMMO3': '0.141', 'WEAPON4': '0.150', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.438', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.344', 'weapon3': '1.448'} [2024-08-05 06:32:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3342336. Throughput: 0: 287.5. Samples: 836337. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:20,484][00034] Avg episode reward: [(0, '-4.535')] [2024-08-05 06:32:20,488][00132] Saving new best policy, reward=-4.535! [2024-08-05 06:32:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3350528. Throughput: 0: 289.4. Samples: 838104. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:25,484][00034] Avg episode reward: [(0, '-4.535')] [2024-08-05 06:32:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3350528. Throughput: 0: 289.5. Samples: 838974. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:30,485][00034] Avg episode reward: [(0, '-4.535')] [2024-08-05 06:32:30,633][00138] Updated weights for policy 0, policy_version 410 (0.0017) [2024-08-05 06:32:31,499][00139] DAMAGECOUNT value on done: 21741.0 [2024-08-05 06:32:31,712][00139] DAMAGECOUNT value on done: 21735.0 [2024-08-05 06:32:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3358720. Throughput: 0: 289.9. Samples: 840715. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:35,485][00034] Avg episode reward: [(0, '-4.531')] [2024-08-05 06:32:35,492][00132] Saving new best policy, reward=-4.531! [2024-08-05 06:32:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3366912. Throughput: 0: 290.0. Samples: 842472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:40,486][00034] Avg episode reward: [(0, '-4.531')] [2024-08-05 06:32:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3375104. Throughput: 0: 289.1. Samples: 843304. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:45,484][00034] Avg episode reward: [(0, '-4.531')] [2024-08-05 06:32:46,030][00139] DAMAGECOUNT value on done: 21851.0 [2024-08-05 06:32:46,031][00139] Sum rewards: -7.268, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-3.177', 'AMMO2': '0.011', 'weapon5': '0.012', 'WEAPON1': '0.020', 'AMMO5': '0.025', 'ARMOR': '0.044', 'AMMO4': '0.054', 'weapon4': '0.054', 'HITCOUNT': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.211', 'DAMAGECOUNT': '0.330', 'WEAPON5': '0.400', 'weapon2': '0.866', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.200', 'weapon3': '1.932'} [2024-08-05 06:32:46,263][00139] DAMAGECOUNT value on done: 21845.0 [2024-08-05 06:32:46,264][00139] Sum rewards: -1.437, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.046', 'AMMO4': '-0.021', 'AMMO2': '-0.004', 'AMMO5': '0.013', 'ARMOR': '0.028', 'HITCOUNT': '0.060', 'weapon5': '0.068', 'AMMO3': '0.080', 'weapon7': '0.084', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.330', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon2': '1.096', 'weapon3': '1.724'} [2024-08-05 06:32:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3375104. Throughput: 0: 289.4. Samples: 845040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:50,484][00034] Avg episode reward: [(0, '-4.579')] [2024-08-05 06:32:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3383296. Throughput: 0: 290.2. Samples: 846752. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:32:55,484][00034] Avg episode reward: [(0, '-4.579')] [2024-08-05 06:33:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3391488. Throughput: 0: 290.0. Samples: 847617. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:00,485][00034] Avg episode reward: [(0, '-4.579')] [2024-08-05 06:33:00,614][00139] DAMAGECOUNT value on done: 21946.0 [2024-08-05 06:33:00,831][00139] DAMAGECOUNT value on done: 22180.0 [2024-08-05 06:33:00,831][00139] Sum rewards: -6.345, reward structure: {'DEATHCOUNT': '-11.250', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.268', 'AMMO5': '0.017', 'AMMO2': '0.018', 'weapon4': '0.046', 'weapon5': '0.060', 'AMMO4': '0.089', 'WEAPON4': '0.100', 'AMMO3': '0.144', 'WEAPON5': '0.300', 'HITCOUNT': '0.320', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.005', 'weapon2': '1.380', 'weapon3': '1.394'} [2024-08-05 06:33:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3391488. Throughput: 0: 289.2. Samples: 849353. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:05,484][00034] Avg episode reward: [(0, '-4.595')] [2024-08-05 06:33:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3399680. Throughput: 0: 288.8. Samples: 851099. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:10,484][00034] Avg episode reward: [(0, '-4.595')] [2024-08-05 06:33:15,259][00139] DAMAGECOUNT value on done: 22371.0 [2024-08-05 06:33:15,260][00139] Sum rewards: -3.981, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.000', 'AMMO2': '0.005', 'weapon4': '0.008', 'WEAPON1': '0.010', 'AMMO5': '0.023', 'AMMO4': '0.024', 'ARMOR': '0.040', 'WEAPON4': '0.100', 'weapon5': '0.146', 'AMMO3': '0.166', 'HITCOUNT': '0.230', 'WEAPON5': '0.400', 'WEAPON3': '1.000', 'weapon2': '1.230', 'DAMAGECOUNT': '1.275', 'FRAGCOUNT': '1.500', 'weapon3': '1.612'} [2024-08-05 06:33:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3407872. Throughput: 0: 287.6. Samples: 851914. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:15,484][00034] Avg episode reward: [(0, '-4.576')] [2024-08-05 06:33:15,487][00139] DAMAGECOUNT value on done: 22280.0 [2024-08-05 06:33:15,488][00139] Sum rewards: -7.813, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.643', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.012', 'ARMOR': '0.020', 'AMMO5': '0.023', 'weapon4': '0.034', 'weapon5': '0.056', 'AMMO4': '0.058', 'HITCOUNT': '0.120', 'AMMO3': '0.157', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.300', 'WEAPON3': '0.750', 'weapon3': '1.402', 'weapon2': '1.448'} [2024-08-05 06:33:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3407872. Throughput: 0: 287.2. Samples: 853637. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:20,484][00034] Avg episode reward: [(0, '-4.628')] [2024-08-05 06:33:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3416064. Throughput: 0: 287.6. Samples: 855415. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:25,484][00034] Avg episode reward: [(0, '-4.628')] [2024-08-05 06:33:29,600][00139] DAMAGECOUNT value on done: 22481.0 [2024-08-05 06:33:29,826][00139] DAMAGECOUNT value on done: 22345.0 [2024-08-05 06:33:29,827][00139] Sum rewards: -5.560, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.458', 'AMMO4': '-0.000', 'AMMO2': '-0.000', 'AMMO5': '0.005', 'weapon5': '0.060', 'HITCOUNT': '0.070', 'ARMOR': '0.098', 'WEAPON5': '0.100', 'AMMO3': '0.158', 'DAMAGECOUNT': '0.195', 'WEAPON3': '0.900', 'weapon3': '1.356', 'weapon2': '1.456', 'FRAGCOUNT': '2.000'} [2024-08-05 06:33:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3424256. Throughput: 0: 288.5. Samples: 856287. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:30,485][00034] Avg episode reward: [(0, '-4.665')] [2024-08-05 06:33:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3432448. Throughput: 0: 289.2. Samples: 858056. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:35,484][00034] Avg episode reward: [(0, '-4.665')] [2024-08-05 06:33:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000419_3432448.pth... [2024-08-05 06:33:35,572][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000385_3153920.pth [2024-08-05 06:33:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3432448. Throughput: 0: 289.7. Samples: 859788. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:40,485][00034] Avg episode reward: [(0, '-4.665')] [2024-08-05 06:33:41,474][00138] Updated weights for policy 0, policy_version 420 (0.0018) [2024-08-05 06:33:44,277][00139] DAMAGECOUNT value on done: 22699.0 [2024-08-05 06:33:44,278][00139] Sum rewards: -2.972, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.206', 'AMMO2': '0.008', 'AMMO5': '0.019', 'weapon5': '0.034', 'AMMO4': '0.038', 'WEAPON4': '0.050', 'weapon4': '0.054', 'AMMO3': '0.116', 'HITCOUNT': '0.180', 'WEAPON5': '0.350', 'ARMOR': '0.404', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.654', 'weapon2': '1.102', 'weapon3': '1.674'} [2024-08-05 06:33:44,516][00139] DAMAGECOUNT value on done: 22710.0 [2024-08-05 06:33:44,516][00139] Sum rewards: -2.166, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.230', 'weapon5': '0.002', 'AMMO5': '0.007', 'weapon4': '0.016', 'AMMO2': '0.023', 'WEAPON5': '0.050', 'AMMO4': '0.114', 'ARMOR': '0.128', 'AMMO3': '0.147', 'WEAPON4': '0.250', 'HITCOUNT': '0.280', 'WEAPON3': '0.900', 'weapon2': '0.942', 'DAMAGECOUNT': '1.095', 'FRAGCOUNT': '2.000', 'weapon3': '2.110'} [2024-08-05 
06:33:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3440640. Throughput: 0: 289.4. Samples: 860642. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:45,485][00034] Avg episode reward: [(0, '-4.605')] [2024-08-05 06:33:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3448832. Throughput: 0: 288.5. Samples: 862337. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:50,485][00034] Avg episode reward: [(0, '-4.605')] [2024-08-05 06:33:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3448832. Throughput: 0: 289.9. Samples: 864146. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:33:55,485][00034] Avg episode reward: [(0, '-4.605')] [2024-08-05 06:33:58,480][00139] DAMAGECOUNT value on done: 22794.0 [2024-08-05 06:33:58,481][00139] Sum rewards: -6.946, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.338', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'AMMO2': '0.022', 'weapon5': '0.030', 'weapon4': '0.034', 'ARMOR': '0.072', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'AMMO4': '0.110', 'WEAPON4': '0.150', 'AMMO3': '0.176', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon3': '1.188', 'weapon2': '1.410'} [2024-08-05 06:33:58,712][00139] DAMAGECOUNT value on done: 22956.0 [2024-08-05 06:33:58,713][00139] Sum rewards: -3.200, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.551', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'AMMO5': '0.012', 'weapon5': '0.036', 'WEAPON5': '0.050', 'weapon7': '0.072', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.150', 'AMMO3': '0.178', 'DAMAGECOUNT': '0.738', 'weapon2': '0.840', 'WEAPON3': '0.950', 'FRAGCOUNT': '2.000', 'weapon3': '2.032'} [2024-08-05 06:34:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3457024. Throughput: 0: 291.6. Samples: 865034. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:00,484][00034] Avg episode reward: [(0, '-4.590')] [2024-08-05 06:34:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3465216. Throughput: 0: 291.4. Samples: 866749. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:05,485][00034] Avg episode reward: [(0, '-4.590')] [2024-08-05 06:34:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3473408. Throughput: 0: 290.5. Samples: 868489. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:10,484][00034] Avg episode reward: [(0, '-4.590')] [2024-08-05 06:34:12,896][00139] DAMAGECOUNT value on done: 22859.0 [2024-08-05 06:34:13,127][00139] DAMAGECOUNT value on done: 23080.0 [2024-08-05 06:34:13,128][00139] Sum rewards: -3.113, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.784', 'ARMOR': '0.008', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO2': '0.015', 'weapon5': '0.054', 'AMMO4': '0.074', 'HITCOUNT': '0.090', 'AMMO3': '0.092', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.372', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon2': '1.166', 'weapon3': '1.528'} [2024-08-05 06:34:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3473408. Throughput: 0: 291.4. Samples: 869400. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:15,487][00034] Avg episode reward: [(0, '-4.581')] [2024-08-05 06:34:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3481600. Throughput: 0: 289.6. Samples: 871086. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:20,484][00034] Avg episode reward: [(0, '-4.581')] [2024-08-05 06:34:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3489792. Throughput: 0: 290.7. Samples: 872871. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:25,485][00034] Avg episode reward: [(0, '-4.581')] [2024-08-05 06:34:27,385][00139] DAMAGECOUNT value on done: 23279.0 [2024-08-05 06:34:27,385][00139] Sum rewards: -7.180, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.635', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.012', 'AMMO5': '0.017', 'weapon5': '0.036', 'AMMO4': '0.058', 'ARMOR': '0.060', 'weapon4': '0.074', 'AMMO3': '0.182', 'WEAPON4': '0.200', 'HITCOUNT': '0.330', 'WEAPON5': '0.350', 'WEAPON3': '0.950', 'weapon2': '1.170', 'DAMAGECOUNT': '1.260', 'weapon3': '1.756'} [2024-08-05 06:34:27,593][00139] DAMAGECOUNT value on done: 23249.0 [2024-08-05 06:34:27,593][00139] Sum rewards: 3.469, reward structure: {'DEATHCOUNT': '-3.000', 'HEALTH': '-0.103', 'AMMO5': '0.005', 'AMMO2': '0.016', 'weapon4': '0.048', 'weapon5': '0.064', 'weapon7': '0.066', 'AMMO3': '0.072', 'AMMO4': '0.080', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.130', 'WEAPON3': '0.250', 'DAMAGECOUNT': '0.507', 'ARMOR': '0.528', 'weapon2': '1.356', 'weapon3': '1.450', 'FRAGCOUNT': '1.500'} [2024-08-05 06:34:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3489792. Throughput: 0: 291.7. Samples: 873769. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:30,484][00034] Avg episode reward: [(0, '-4.578')] [2024-08-05 06:34:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3497984. Throughput: 0: 293.9. Samples: 875564. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:35,484][00034] Avg episode reward: [(0, '-4.578')] [2024-08-05 06:34:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3506176. Throughput: 0: 293.1. Samples: 877334. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:40,484][00034] Avg episode reward: [(0, '-4.578')] [2024-08-05 06:34:41,539][00139] DAMAGECOUNT value on done: 23484.0 [2024-08-05 06:34:41,540][00139] Sum rewards: -4.991, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.528', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.008', 'WEAPON1': '0.010', 'AMMO2': '0.018', 'weapon5': '0.022', 'weapon4': '0.078', 'AMMO4': '0.091', 'WEAPON4': '0.100', 'AMMO3': '0.143', 'HITCOUNT': '0.150', 'WEAPON5': '0.150', 'ARMOR': '0.152', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.700', 'weapon3': '1.170', 'weapon2': '1.630'} [2024-08-05 06:34:41,791][00139] DAMAGECOUNT value on done: 23294.0 [2024-08-05 06:34:41,791][00139] Sum rewards: -5.479, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.003', 'HEALTH': '0.008', 'WEAPON1': '0.010', 'AMMO2': '0.025', 'WEAPON5': '0.050', 'HITCOUNT': '0.050', 'weapon5': '0.056', 'ARMOR': '0.081', 'AMMO3': '0.089', 'weapon4': '0.108', 'AMMO4': '0.122', 'DAMAGECOUNT': '0.135', 'WEAPON4': '0.250', 'WEAPON3': '0.500', 'weapon2': '1.224', 'weapon3': '1.560'} [2024-08-05 06:34:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3514368. Throughput: 0: 292.1. Samples: 878180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:45,484][00034] Avg episode reward: [(0, '-4.619')] [2024-08-05 06:34:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 3514368. Throughput: 0: 291.3. Samples: 879859. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:50,484][00034] Avg episode reward: [(0, '-4.619')] [2024-08-05 06:34:51,870][00138] Updated weights for policy 0, policy_version 430 (0.0019) [2024-08-05 06:34:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3522560. Throughput: 0: 291.5. Samples: 881605. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:34:55,485][00034] Avg episode reward: [(0, '-4.619')] [2024-08-05 06:34:56,219][00139] DAMAGECOUNT value on done: 23582.0 [2024-08-05 06:34:56,220][00139] Sum rewards: -7.929, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.716', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.005', 'AMMO2': '0.025', 'weapon4': '0.038', 'WEAPON1': '0.060', 'weapon5': '0.062', 'ARMOR': '0.080', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO4': '0.123', 'AMMO3': '0.168', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.294', 'WEAPON3': '0.900', 'weapon3': '1.376', 'weapon2': '1.496'} [2024-08-05 06:34:56,466][00139] DAMAGECOUNT value on done: 23393.0 [2024-08-05 06:34:56,466][00139] Sum rewards: -7.403, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.964', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.011', 'weapon4': '0.016', 'weapon5': '0.032', 'AMMO4': '0.057', 'ARMOR': '0.064', 'HITCOUNT': '0.100', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.152', 'DAMAGECOUNT': '0.297', 'WEAPON3': '0.850', 'weapon2': '1.306', 'weapon3': '1.656'} [2024-08-05 06:35:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3530752. Throughput: 0: 291.5. Samples: 882516. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:00,485][00034] Avg episode reward: [(0, '-4.761')] [2024-08-05 06:35:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3530752. Throughput: 0: 293.6. Samples: 884299. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:05,484][00034] Avg episode reward: [(0, '-4.761')] [2024-08-05 06:35:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3538944. Throughput: 0: 292.3. Samples: 886025. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:10,484][00034] Avg episode reward: [(0, '-4.761')] [2024-08-05 06:35:10,520][00139] DAMAGECOUNT value on done: 23713.0 [2024-08-05 06:35:10,743][00139] DAMAGECOUNT value on done: 23647.0 [2024-08-05 06:35:10,743][00139] Sum rewards: -3.148, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.643', 'weapon5': '0.002', 'AMMO5': '0.010', 'AMMO2': '0.013', 'AMMO4': '0.063', 'WEAPON5': '0.100', 'ARMOR': '0.104', 'AMMO3': '0.149', 'WEAPON4': '0.150', 'weapon4': '0.202', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.762', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.238', 'weapon3': '1.632'} [2024-08-05 06:35:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3547136. Throughput: 0: 291.5. Samples: 886886. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:15,485][00034] Avg episode reward: [(0, '-4.732')] [2024-08-05 06:35:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3555328. Throughput: 0: 288.5. Samples: 888548. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:20,484][00034] Avg episode reward: [(0, '-4.732')] [2024-08-05 06:35:25,184][00139] DAMAGECOUNT value on done: 23957.0 [2024-08-05 06:35:25,185][00139] Sum rewards: -6.323, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.486', 'AMMO4': '-0.021', 'AMMO2': '-0.004', 'AMMO5': '0.012', 'ARMOR': '0.060', 'weapon5': '0.098', 'AMMO3': '0.180', 'HITCOUNT': '0.230', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.732', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.178', 'weapon3': '1.798'} [2024-08-05 06:35:25,410][00139] DAMAGECOUNT value on done: 23690.0 [2024-08-05 06:35:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3555328. Throughput: 0: 288.6. Samples: 890319. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:25,484][00034] Avg episode reward: [(0, '-4.731')] [2024-08-05 06:35:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3563520. Throughput: 0: 288.3. Samples: 891155. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:30,484][00034] Avg episode reward: [(0, '-4.731')] [2024-08-05 06:35:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3571712. Throughput: 0: 289.8. Samples: 892902. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:35,484][00034] Avg episode reward: [(0, '-4.731')] [2024-08-05 06:35:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000436_3571712.pth... [2024-08-05 06:35:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000402_3293184.pth [2024-08-05 06:35:39,755][00139] DAMAGECOUNT value on done: 24034.0 [2024-08-05 06:35:39,980][00139] DAMAGECOUNT value on done: 23820.0 [2024-08-05 06:35:39,981][00139] Sum rewards: -8.463, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.088', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.003', 'weapon5': '0.008', 'AMMO2': '0.014', 'WEAPON5': '0.050', 'AMMO4': '0.067', 'ARMOR': '0.096', 'HITCOUNT': '0.110', 'weapon4': '0.122', 'WEAPON4': '0.150', 'AMMO3': '0.203', 'DAMAGECOUNT': '0.390', 'WEAPON3': '1.050', 'weapon2': '1.402', 'weapon3': '1.460'} [2024-08-05 06:35:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3571712. Throughput: 0: 289.6. Samples: 894639. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:40,484][00034] Avg episode reward: [(0, '-4.742')] [2024-08-05 06:35:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3579904. Throughput: 0: 288.4. Samples: 895493. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:45,485][00034] Avg episode reward: [(0, '-4.742')] [2024-08-05 06:35:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3588096. Throughput: 0: 286.8. Samples: 897205. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:50,487][00034] Avg episode reward: [(0, '-4.742')] [2024-08-05 06:35:54,529][00139] DAMAGECOUNT value on done: 24173.0 [2024-08-05 06:35:54,530][00139] Sum rewards: -0.723, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.197', 'AMMO5': '0.007', 'AMMO2': '0.019', 'ARMOR': '0.044', 'weapon5': '0.056', 'weapon4': '0.088', 'HITCOUNT': '0.090', 'AMMO4': '0.093', 'AMMO3': '0.148', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.417', 'WEAPON3': '0.700', 'weapon2': '1.188', 'weapon3': '1.824', 'FRAGCOUNT': '2.000'} [2024-08-05 06:35:54,756][00139] DAMAGECOUNT value on done: 24065.0 [2024-08-05 06:35:54,757][00139] Sum rewards: -7.522, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.328', 'weapon5': '0.008', 'AMMO5': '0.010', 'AMMO2': '0.010', 'ARMOR': '0.028', 'WEAPON4': '0.050', 'AMMO4': '0.051', 'WEAPON5': '0.100', 'HITCOUNT': '0.200', 'weapon4': '0.214', 'AMMO3': '0.229', 'DAMAGECOUNT': '0.735', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon3': '1.348', 'weapon2': '1.522'} [2024-08-05 06:35:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3588096. Throughput: 0: 286.3. Samples: 898908. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:35:55,484][00034] Avg episode reward: [(0, '-4.725')] [2024-08-05 06:36:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3596288. Throughput: 0: 286.4. Samples: 899776. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:00,484][00034] Avg episode reward: [(0, '-4.725')] [2024-08-05 06:36:02,836][00138] Updated weights for policy 0, policy_version 440 (0.0019) [2024-08-05 06:36:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3604480. Throughput: 0: 288.8. Samples: 901544. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:05,484][00034] Avg episode reward: [(0, '-4.725')] [2024-08-05 06:36:08,873][00139] DAMAGECOUNT value on done: 24580.0 [2024-08-05 06:36:08,874][00139] Sum rewards: -0.065, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.233', 'AMMO5': '0.005', 'AMMO2': '0.038', 'ARMOR': '0.052', 'weapon5': '0.052', 'weapon4': '0.134', 'AMMO3': '0.148', 'WEAPON5': '0.150', 'AMMO4': '0.187', 'HITCOUNT': '0.250', 'WEAPON4': '0.400', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.221', 'weapon2': '1.274', 'weapon3': '1.556', 'FRAGCOUNT': '3.000'} [2024-08-05 06:36:09,096][00139] DAMAGECOUNT value on done: 24185.0 [2024-08-05 06:36:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3612672. Throughput: 0: 288.2. Samples: 903289. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:10,485][00034] Avg episode reward: [(0, '-4.669')] [2024-08-05 06:36:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3612672. Throughput: 0: 289.4. Samples: 904179. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:15,484][00034] Avg episode reward: [(0, '-4.669')] [2024-08-05 06:36:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3620864. Throughput: 0: 289.3. Samples: 905922. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:20,485][00034] Avg episode reward: [(0, '-4.669')] [2024-08-05 06:36:23,439][00139] DAMAGECOUNT value on done: 24705.0 [2024-08-05 06:36:23,440][00139] Sum rewards: -0.873, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.514', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'AMMO5': '0.010', 'weapon4': '0.020', 'ARMOR': '0.056', 'weapon7': '0.072', 'WEAPON4': '0.100', 'AMMO3': '0.107', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HITCOUNT': '0.120', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.375', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.342', 'weapon2': '1.634'} [2024-08-05 06:36:23,672][00139] DAMAGECOUNT value on done: 24570.0 [2024-08-05 06:36:23,673][00139] Sum rewards: 0.348, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.578', 'AMMO5': '0.003', 'AMMO2': '0.005', 'weapon5': '0.006', 'AMMO4': '0.025', 'weapon4': '0.028', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'ARMOR': '0.056', 'AMMO3': '0.074', 'HITCOUNT': '0.300', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.155', 'weapon3': '1.186', 'FRAGCOUNT': '1.500', 'weapon2': '1.988'} [2024-08-05 06:36:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3629056. Throughput: 0: 288.0. Samples: 907600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:25,484][00034] Avg episode reward: [(0, '-4.558')] [2024-08-05 06:36:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3629056. Throughput: 0: 288.9. Samples: 908492. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:30,484][00034] Avg episode reward: [(0, '-4.558')] [2024-08-05 06:36:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3637248. Throughput: 0: 288.9. Samples: 910205. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:35,484][00034] Avg episode reward: [(0, '-4.558')] [2024-08-05 06:36:37,948][00139] DAMAGECOUNT value on done: 25224.0 [2024-08-05 06:36:37,949][00139] Sum rewards: -0.558, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.780', 'AMMO4': '-0.016', 'AMMO2': '-0.003', 'AMMO5': '0.005', 'weapon5': '0.010', 'ARMOR': '0.089', 'weapon4': '0.094', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.198', 'HITCOUNT': '0.320', 'WEAPON3': '1.150', 'weapon3': '1.490', 'DAMAGECOUNT': '1.557', 'weapon2': '1.628', 'FRAGCOUNT': '6.000'} [2024-08-05 06:36:38,241][00139] DAMAGECOUNT value on done: 24805.0 [2024-08-05 06:36:38,241][00139] Sum rewards: -7.814, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.308', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'weapon4': '0.014', 'AMMO2': '0.014', 'weapon5': '0.020', 'ARMOR': '0.036', 'AMMO4': '0.070', 'WEAPON4': '0.100', 'HITCOUNT': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.210', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.705', 'weapon2': '1.024', 'WEAPON3': '1.050', 'weapon3': '2.184'} [2024-08-05 06:36:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3645440. Throughput: 0: 289.5. Samples: 911936. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:40,484][00034] Avg episode reward: [(0, '-4.576')] [2024-08-05 06:36:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3653632. Throughput: 0: 289.9. Samples: 912822. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:45,485][00034] Avg episode reward: [(0, '-4.576')] [2024-08-05 06:36:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3653632. Throughput: 0: 289.4. Samples: 914569. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:50,484][00034] Avg episode reward: [(0, '-4.576')] [2024-08-05 06:36:52,581][00139] DAMAGECOUNT value on done: 25352.0 [2024-08-05 06:36:52,858][00139] DAMAGECOUNT value on done: 25075.0 [2024-08-05 06:36:52,859][00139] Sum rewards: 3.028, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.170', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'AMMO2': '0.011', 'weapon4': '0.044', 'AMMO4': '0.057', 'weapon5': '0.088', 'weapon7': '0.094', 'ARMOR': '0.097', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.104', 'HITCOUNT': '0.130', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.810', 'weapon2': '1.150', 'weapon3': '1.598', 'FRAGCOUNT': '2.500'} [2024-08-05 06:36:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3661824. Throughput: 0: 287.9. Samples: 916244. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:36:55,485][00034] Avg episode reward: [(0, '-4.506')] [2024-08-05 06:36:55,493][00132] Saving new best policy, reward=-4.506! [2024-08-05 06:37:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3670016. Throughput: 0: 288.3. Samples: 917152. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:00,485][00034] Avg episode reward: [(0, '-4.506')] [2024-08-05 06:37:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3670016. Throughput: 0: 287.5. Samples: 918860. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:05,485][00034] Avg episode reward: [(0, '-4.506')] [2024-08-05 06:37:07,152][00139] DAMAGECOUNT value on done: 25519.0 [2024-08-05 06:37:07,383][00139] DAMAGECOUNT value on done: 25238.0 [2024-08-05 06:37:07,383][00139] Sum rewards: -3.267, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.658', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.005', 'weapon4': '0.016', 'AMMO2': '0.020', 'ARMOR': '0.032', 'weapon7': '0.062', 'weapon5': '0.064', 'AMMO4': '0.098', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.115', 'HITCOUNT': '0.140', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'DAMAGECOUNT': '0.489', 'WEAPON3': '0.700', 'weapon3': '1.504', 'weapon2': '1.546'} [2024-08-05 06:37:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3678208. Throughput: 0: 289.2. Samples: 920616. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:10,484][00034] Avg episode reward: [(0, '-4.500')] [2024-08-05 06:37:10,486][00132] Saving new best policy, reward=-4.500! [2024-08-05 06:37:13,666][00138] Updated weights for policy 0, policy_version 450 (0.0017) [2024-08-05 06:37:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3686400. Throughput: 0: 288.6. Samples: 921479. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:15,485][00034] Avg episode reward: [(0, '-4.500')] [2024-08-05 06:37:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3686400. Throughput: 0: 288.6. Samples: 923193. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:20,484][00034] Avg episode reward: [(0, '-4.500')] [2024-08-05 06:37:21,763][00139] DAMAGECOUNT value on done: 25677.0 [2024-08-05 06:37:21,763][00139] Sum rewards: -2.506, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.930', 'AMMO5': '0.003', 'AMMO2': '0.006', 'weapon4': '0.014', 'weapon5': '0.022', 'ARMOR': '0.024', 'AMMO4': '0.029', 'WEAPON5': '0.050', 'HITCOUNT': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.160', 'DAMAGECOUNT': '0.474', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.608', 'weapon3': '1.634'} [2024-08-05 06:37:21,985][00139] DAMAGECOUNT value on done: 25483.0 [2024-08-05 06:37:21,986][00139] Sum rewards: -1.883, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.300', 'AMMO5': '0.007', 'AMMO2': '0.025', 'weapon5': '0.026', 'ARMOR': '0.068', 'WEAPON5': '0.100', 'AMMO3': '0.121', 'AMMO4': '0.126', 'weapon4': '0.144', 'HITCOUNT': '0.240', 'WEAPON4': '0.250', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.735', 'weapon3': '1.430', 'weapon2': '1.544', 'FRAGCOUNT': '3.000'} [2024-08-05 06:37:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3694592. Throughput: 0: 287.1. Samples: 924855. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:25,484][00034] Avg episode reward: [(0, '-4.475')] [2024-08-05 06:37:25,492][00132] Saving new best policy, reward=-4.475! [2024-08-05 06:37:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3702784. Throughput: 0: 286.8. Samples: 925726. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:30,485][00034] Avg episode reward: [(0, '-4.475')] [2024-08-05 06:37:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3710976. Throughput: 0: 287.0. Samples: 927485. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:35,484][00034] Avg episode reward: [(0, '-4.475')] [2024-08-05 06:37:35,491][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000453_3710976.pth... [2024-08-05 06:37:35,564][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000419_3432448.pth [2024-08-05 06:37:36,435][00139] DAMAGECOUNT value on done: 25907.0 [2024-08-05 06:37:36,436][00139] Sum rewards: -4.296, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.071', 'AMMO2': '0.007', 'AMMO5': '0.007', 'ARMOR': '0.032', 'AMMO4': '0.034', 'weapon4': '0.052', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'HITCOUNT': '0.170', 'AMMO3': '0.173', 'DAMAGECOUNT': '0.690', 'WEAPON3': '0.850', 'weapon2': '1.186', 'FRAGCOUNT': '2.000', 'weapon3': '2.074'} [2024-08-05 06:37:36,672][00139] DAMAGECOUNT value on done: 25818.0 [2024-08-05 06:37:36,673][00139] Sum rewards: 2.800, reward structure: {'DEATHCOUNT': '-6.000', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.014', 'weapon4': '0.014', 'ARMOR': '0.060', 'AMMO4': '0.068', 'WEAPON4': '0.100', 'AMMO3': '0.109', 'WEAPON5': '0.150', 'HITCOUNT': '0.220', 'HEALTH': '0.440', 'WEAPON3': '0.450', 'DAMAGECOUNT': '1.005', 'weapon2': '1.076', 'weapon3': '2.078', 'FRAGCOUNT': '3.000'} [2024-08-05 06:37:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3710976. Throughput: 0: 287.8. Samples: 929193. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:40,485][00034] Avg episode reward: [(0, '-4.375')] [2024-08-05 06:37:40,486][00132] Saving new best policy, reward=-4.375! [2024-08-05 06:37:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3719168. Throughput: 0: 286.7. Samples: 930055. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:45,484][00034] Avg episode reward: [(0, '-4.375')] [2024-08-05 06:37:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3727360. Throughput: 0: 287.3. Samples: 931788. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:50,485][00034] Avg episode reward: [(0, '-4.375')] [2024-08-05 06:37:51,027][00139] DAMAGECOUNT value on done: 26222.0 [2024-08-05 06:37:51,028][00139] Sum rewards: -5.011, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.878', 'AMMO4': '-0.030', 'AMMO2': '-0.006', 'AMMO5': '0.005', 'WEAPON5': '0.100', 'AMMO3': '0.150', 'HITCOUNT': '0.260', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.945', 'FRAGCOUNT': '1.000', 'weapon3': '1.424', 'weapon2': '1.968'} [2024-08-05 06:37:51,257][00139] DAMAGECOUNT value on done: 26178.0 [2024-08-05 06:37:51,258][00139] Sum rewards: -3.832, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.002', 'AMMO2': '0.008', 'WEAPON1': '0.020', 'weapon5': '0.022', 'AMMO5': '0.023', 'AMMO4': '0.040', 'weapon7': '0.044', 'ARMOR': '0.048', 'weapon4': '0.070', 'WEAPON4': '0.150', 'AMMO6': '0.160', 'AMMO7': '0.160', 'AMMO3': '0.167', 'WEAPON7': '0.200', 'HITCOUNT': '0.250', 'WEAPON5': '0.300', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.080', 'weapon3': '1.352', 'weapon2': '1.726', 'FRAGCOUNT': '2.000'} [2024-08-05 06:37:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3727360. Throughput: 0: 286.5. Samples: 933510. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:37:55,484][00034] Avg episode reward: [(0, '-4.354')] [2024-08-05 06:37:55,493][00132] Saving new best policy, reward=-4.354! [2024-08-05 06:38:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3735552. Throughput: 0: 285.4. Samples: 934321. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:00,484][00034] Avg episode reward: [(0, '-4.354')] [2024-08-05 06:38:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3743744. Throughput: 0: 284.4. Samples: 935990. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:05,484][00034] Avg episode reward: [(0, '-4.354')] [2024-08-05 06:38:05,999][00139] DAMAGECOUNT value on done: 26301.0 [2024-08-05 06:38:06,248][00139] DAMAGECOUNT value on done: 26462.0 [2024-08-05 06:38:06,249][00139] Sum rewards: -4.785, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.342', 'weapon5': '0.012', 'AMMO2': '0.014', 'AMMO5': '0.030', 'weapon4': '0.040', 'AMMO4': '0.070', 'WEAPON4': '0.100', 'AMMO3': '0.201', 'HITCOUNT': '0.250', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.852', 'WEAPON3': '1.150', 'weapon2': '1.364', 'weapon3': '2.074', 'FRAGCOUNT': '3.000'} [2024-08-05 06:38:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3743744. Throughput: 0: 286.1. Samples: 937729. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:10,487][00034] Avg episode reward: [(0, '-4.324')] [2024-08-05 06:38:10,488][00132] Saving new best policy, reward=-4.324! [2024-08-05 06:38:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3751936. Throughput: 0: 286.6. Samples: 938623. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:15,484][00034] Avg episode reward: [(0, '-4.324')] [2024-08-05 06:38:20,425][00139] DAMAGECOUNT value on done: 26396.0 [2024-08-05 06:38:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3760128. Throughput: 0: 285.8. Samples: 940347. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:20,484][00034] Avg episode reward: [(0, '-4.259')] [2024-08-05 06:38:20,486][00132] Saving new best policy, reward=-4.259! 
[2024-08-05 06:38:20,664][00139] DAMAGECOUNT value on done: 26574.0 [2024-08-05 06:38:25,140][00138] Updated weights for policy 0, policy_version 460 (0.0017) [2024-08-05 06:38:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3768320. Throughput: 0: 286.6. Samples: 942090. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:25,485][00034] Avg episode reward: [(0, '-4.200')] [2024-08-05 06:38:25,493][00132] Saving new best policy, reward=-4.200! [2024-08-05 06:38:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3768320. Throughput: 0: 285.4. Samples: 942900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:30,485][00034] Avg episode reward: [(0, '-4.200')] [2024-08-05 06:38:35,302][00139] DAMAGECOUNT value on done: 26423.0 [2024-08-05 06:38:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3776512. Throughput: 0: 284.4. Samples: 944588. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:35,484][00034] Avg episode reward: [(0, '-4.236')] [2024-08-05 06:38:35,539][00139] DAMAGECOUNT value on done: 27089.0 [2024-08-05 06:38:35,539][00139] Sum rewards: -1.227, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.717', 'AMMO5': '0.005', 'weapon5': '0.006', 'AMMO2': '0.009', 'ARMOR': '0.036', 'AMMO4': '0.045', 'WEAPON5': '0.100', 'AMMO3': '0.147', 'HITCOUNT': '0.340', 'weapon2': '0.966', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.545', 'weapon3': '2.290', 'FRAGCOUNT': '3.000'} [2024-08-05 06:38:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3784704. Throughput: 0: 283.9. Samples: 946285. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:40,487][00034] Avg episode reward: [(0, '-4.212')] [2024-08-05 06:38:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3784704. Throughput: 0: 285.2. Samples: 947156. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:45,484][00034] Avg episode reward: [(0, '-4.212')] [2024-08-05 06:38:50,140][00139] DAMAGECOUNT value on done: 26625.0 [2024-08-05 06:38:50,140][00139] Sum rewards: 0.851, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.775', 'AMMO2': '0.011', 'AMMO5': '0.021', 'AMMO4': '0.055', 'AMMO3': '0.073', 'HITCOUNT': '0.090', 'weapon5': '0.102', 'weapon4': '0.184', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'WEAPON3': '0.450', 'ARMOR': '0.500', 'DAMAGECOUNT': '0.606', 'FRAGCOUNT': '1.000', 'weapon3': '1.280', 'weapon2': '1.304'} [2024-08-05 06:38:50,369][00139] DAMAGECOUNT value on done: 27325.0 [2024-08-05 06:38:50,369][00139] Sum rewards: -3.298, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.754', 'AMMO2': '0.010', 'AMMO5': '0.017', 'weapon7': '0.040', 'ARMOR': '0.048', 'AMMO4': '0.052', 'weapon5': '0.056', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'AMMO6': '0.160', 'AMMO7': '0.160', 'weapon4': '0.166', 'AMMO3': '0.190', 'WEAPON7': '0.200', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.708', 'WEAPON3': '0.900', 'weapon3': '1.330', 'weapon2': '1.838', 'FRAGCOUNT': '2.000'} [2024-08-05 06:38:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3792896. Throughput: 0: 285.6. Samples: 948844. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:50,484][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 06:38:50,486][00132] Saving new best policy, reward=-4.158! [2024-08-05 06:38:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3801088. Throughput: 0: 284.6. Samples: 950536. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:38:55,485][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 06:39:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3801088. Throughput: 0: 283.9. Samples: 951397. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:00,485][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 06:39:05,017][00139] DAMAGECOUNT value on done: 26800.0 [2024-08-05 06:39:05,018][00139] Sum rewards: -4.398, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.290', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'AMMO5': '0.028', 'WEAPON4': '0.100', 'AMMO3': '0.164', 'HITCOUNT': '0.170', 'weapon4': '0.176', 'WEAPON5': '0.300', 'ARMOR': '0.498', 'DAMAGECOUNT': '0.525', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon3': '1.420', 'weapon2': '1.696'} [2024-08-05 06:39:05,235][00139] DAMAGECOUNT value on done: 27420.0 [2024-08-05 06:39:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3809280. Throughput: 0: 282.5. Samples: 953061. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:05,484][00034] Avg episode reward: [(0, '-4.195')] [2024-08-05 06:39:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3817472. Throughput: 0: 281.9. Samples: 954777. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:10,485][00034] Avg episode reward: [(0, '-4.195')] [2024-08-05 06:39:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3817472. Throughput: 0: 282.4. Samples: 955610. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:15,484][00034] Avg episode reward: [(0, '-4.195')] [2024-08-05 06:39:19,791][00139] DAMAGECOUNT value on done: 26887.0 [2024-08-05 06:39:19,791][00139] Sum rewards: -7.262, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.072', 'FRAGCOUNT': '-1.500', 'AMMO4': '-0.034', 'AMMO2': '-0.007', 'ARMOR': '0.016', 'AMMO5': '0.018', 'WEAPON4': '0.050', 'weapon5': '0.062', 'HITCOUNT': '0.080', 'weapon4': '0.106', 'AMMO3': '0.140', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.261', 'WEAPON3': '0.650', 'weapon3': '1.172', 'weapon2': '1.846'} [2024-08-05 06:39:20,045][00139] DAMAGECOUNT value on done: 27777.0 [2024-08-05 06:39:20,045][00139] Sum rewards: -0.563, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.700', 'weapon5': '0.002', 'AMMO2': '0.008', 'AMMO5': '0.010', 'AMMO4': '0.041', 'weapon4': '0.044', 'ARMOR': '0.094', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.158', 'HITCOUNT': '0.270', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.071', 'weapon2': '1.210', 'weapon3': '2.228', 'FRAGCOUNT': '3.000'} [2024-08-05 06:39:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3825664. Throughput: 0: 283.5. Samples: 957345. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:20,485][00034] Avg episode reward: [(0, '-4.198')] [2024-08-05 06:39:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 3833856. Throughput: 0: 283.7. Samples: 959052. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:25,484][00034] Avg episode reward: [(0, '-4.198')] [2024-08-05 06:39:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3842048. Throughput: 0: 283.2. Samples: 959902. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:30,484][00034] Avg episode reward: [(0, '-4.198')] [2024-08-05 06:39:34,774][00139] DAMAGECOUNT value on done: 26977.0 [2024-08-05 06:39:34,775][00139] Sum rewards: -7.675, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.718', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.008', 'WEAPON1': '0.010', 'weapon4': '0.024', 'weapon5': '0.024', 'AMMO2': '0.027', 'ARMOR': '0.064', 'HITCOUNT': '0.090', 'AMMO4': '0.134', 'WEAPON5': '0.150', 'AMMO3': '0.186', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.270', 'WEAPON3': '1.000', 'weapon2': '1.040', 'weapon3': '2.316'} [2024-08-05 06:39:35,007][00139] DAMAGECOUNT value on done: 27928.0 [2024-08-05 06:39:35,008][00139] Sum rewards: -3.799, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.217', 'AMMO5': '0.020', 'AMMO2': '0.032', 'HITCOUNT': '0.060', 'weapon4': '0.080', 'weapon7': '0.082', 'ARMOR': '0.088', 'weapon5': '0.104', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.122', 'AMMO4': '0.161', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.453', 'WEAPON3': '0.600', 'weapon2': '1.412', 'weapon3': '1.714'} [2024-08-05 06:39:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3842048. Throughput: 0: 282.4. Samples: 961551. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:35,484][00034] Avg episode reward: [(0, '-4.209')] [2024-08-05 06:39:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000469_3842048.pth... [2024-08-05 06:39:35,576][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000436_3571712.pth [2024-08-05 06:39:37,652][00138] Updated weights for policy 0, policy_version 470 (0.0018) [2024-08-05 06:39:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3850240. Throughput: 0: 283.3. Samples: 963286. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:40,484][00034] Avg episode reward: [(0, '-4.209')] [2024-08-05 06:39:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3858432. Throughput: 0: 283.9. Samples: 964173. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:45,484][00034] Avg episode reward: [(0, '-4.209')] [2024-08-05 06:39:49,274][00139] DAMAGECOUNT value on done: 27102.0 [2024-08-05 06:39:49,275][00139] Sum rewards: -3.250, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.500', 'AMMO2': '0.013', 'AMMO5': '0.016', 'ARMOR': '0.040', 'weapon5': '0.042', 'AMMO4': '0.065', 'WEAPON4': '0.100', 'HITCOUNT': '0.120', 'AMMO3': '0.176', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.375', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.342', 'weapon3': '2.060'} [2024-08-05 06:39:49,517][00139] DAMAGECOUNT value on done: 28326.0 [2024-08-05 06:39:49,518][00139] Sum rewards: -0.668, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.162', 'AMMO4': '-0.091', 'AMMO2': '-0.018', 'WEAPON1': '0.020', 'AMMO5': '0.022', 'AMMO3': '0.105', 'weapon5': '0.124', 'HITCOUNT': '0.180', 'WEAPON5': '0.350', 'ARMOR': '0.600', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.194', 'weapon2': '1.680', 'weapon3': '1.828', 'FRAGCOUNT': '3.000'} [2024-08-05 06:39:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3858432. Throughput: 0: 285.1. Samples: 965891. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:50,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:39:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3866624. Throughput: 0: 284.5. Samples: 967581. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:39:55,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:40:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3874816. Throughput: 0: 285.5. 
Samples: 968456. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:00,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:40:04,044][00139] DAMAGECOUNT value on done: 27452.0 [2024-08-05 06:40:04,044][00139] Sum rewards: -3.667, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.887', 'AMMO2': '0.007', 'AMMO5': '0.018', 'AMMO4': '0.033', 'weapon4': '0.038', 'ARMOR': '0.060', 'weapon5': '0.074', 'WEAPON4': '0.100', 'AMMO3': '0.171', 'HITCOUNT': '0.240', 'WEAPON5': '0.250', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.014', 'DAMAGECOUNT': '1.050', 'weapon3': '2.266'} [2024-08-05 06:40:04,276][00139] DAMAGECOUNT value on done: 28455.0 [2024-08-05 06:40:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3874816. Throughput: 0: 284.7. Samples: 970158. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:05,484][00034] Avg episode reward: [(0, '-4.169')] [2024-08-05 06:40:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3883008. Throughput: 0: 284.4. Samples: 971848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:10,485][00034] Avg episode reward: [(0, '-4.169')] [2024-08-05 06:40:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3891200. Throughput: 0: 285.0. Samples: 972728. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:15,485][00034] Avg episode reward: [(0, '-4.169')] [2024-08-05 06:40:18,614][00139] DAMAGECOUNT value on done: 28219.0 [2024-08-05 06:40:18,615][00139] Sum rewards: 2.578, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.260', 'AMMO2': '0.006', 'AMMO5': '0.010', 'AMMO4': '0.028', 'ARMOR': '0.040', 'weapon5': '0.080', 'AMMO3': '0.161', 'WEAPON5': '0.200', 'HITCOUNT': '0.420', 'WEAPON3': '0.800', 'weapon2': '1.460', 'weapon3': '1.832', 'DAMAGECOUNT': '2.301', 'FRAGCOUNT': '4.500'} [2024-08-05 06:40:18,877][00139] DAMAGECOUNT value on done: 28845.0 [2024-08-05 06:40:18,878][00139] Sum rewards: -2.950, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.952', 'AMMO4': '-0.020', 'AMMO2': '-0.004', 'AMMO5': '0.018', 'weapon5': '0.030', 'ARMOR': '0.032', 'WEAPON4': '0.100', 'weapon4': '0.126', 'WEAPON5': '0.200', 'AMMO3': '0.248', 'HITCOUNT': '0.290', 'DAMAGECOUNT': '1.170', 'WEAPON3': '1.300', 'weapon2': '1.680', 'weapon3': '1.832', 'FRAGCOUNT': '5.000'} [2024-08-05 06:40:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3891200. Throughput: 0: 287.0. Samples: 974464. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:20,485][00034] Avg episode reward: [(0, '-4.145')] [2024-08-05 06:40:20,487][00132] Saving new best policy, reward=-4.145! [2024-08-05 06:40:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3899392. Throughput: 0: 286.0. Samples: 976157. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:25,485][00034] Avg episode reward: [(0, '-4.145')] [2024-08-05 06:40:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3907584. Throughput: 0: 285.4. Samples: 977015. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:30,484][00034] Avg episode reward: [(0, '-4.145')] [2024-08-05 06:40:33,367][00139] DAMAGECOUNT value on done: 28559.0 [2024-08-05 06:40:33,368][00139] Sum rewards: -5.620, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.960', 'weapon5': '0.012', 'AMMO5': '0.020', 'ARMOR': '0.028', 'AMMO2': '0.042', 'AMMO3': '0.172', 'WEAPON5': '0.200', 'AMMO4': '0.210', 'weapon4': '0.220', 'WEAPON4': '0.250', 'HITCOUNT': '0.250', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.020', 'weapon2': '1.518', 'weapon3': '1.548', 'FRAGCOUNT': '2.000'} [2024-08-05 06:40:33,584][00139] DAMAGECOUNT value on done: 29045.0 [2024-08-05 06:40:33,585][00139] Sum rewards: -3.208, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.721', 'AMMO5': '0.005', 'AMMO2': '0.012', 'WEAPON1': '0.020', 'AMMO4': '0.059', 'WEAPON5': '0.100', 'AMMO3': '0.134', 'weapon4': '0.152', 'HITCOUNT': '0.160', 'WEAPON4': '0.200', 'ARMOR': '0.519', 'DAMAGECOUNT': '0.600', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.230', 'weapon3': '1.522'} [2024-08-05 06:40:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3915776. Throughput: 0: 285.1. Samples: 978722. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:35,485][00034] Avg episode reward: [(0, '-4.150')] [2024-08-05 06:40:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3915776. Throughput: 0: 285.3. Samples: 980420. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:40,485][00034] Avg episode reward: [(0, '-4.150')] [2024-08-05 06:40:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3923968. Throughput: 0: 285.2. Samples: 981292. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:45,484][00034] Avg episode reward: [(0, '-4.150')] [2024-08-05 06:40:47,974][00139] DAMAGECOUNT value on done: 28814.0 [2024-08-05 06:40:47,975][00139] Sum rewards: 0.060, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.416', 'AMMO5': '0.005', 'AMMO2': '0.006', 'weapon5': '0.008', 'WEAPON1': '0.020', 'AMMO4': '0.031', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'AMMO3': '0.122', 'ARMOR': '0.130', 'HITCOUNT': '0.160', 'weapon4': '0.250', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.765', 'weapon2': '1.330', 'weapon3': '1.598', 'FRAGCOUNT': '2.000'} [2024-08-05 06:40:48,182][00139] DAMAGECOUNT value on done: 29388.0 [2024-08-05 06:40:48,182][00139] Sum rewards: -3.523, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.830', 'weapon5': '0.006', 'AMMO5': '0.007', 'weapon4': '0.010', 'ARMOR': '0.016', 'WEAPON1': '0.020', 'AMMO2': '0.023', 'AMMO4': '0.114', 'AMMO3': '0.150', 'WEAPON5': '0.150', 'HITCOUNT': '0.230', 'WEAPON4': '0.350', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.029', 'weapon2': '1.426', 'weapon3': '1.626', 'FRAGCOUNT': '3.000'} [2024-08-05 06:40:49,142][00138] Updated weights for policy 0, policy_version 480 (0.0017) [2024-08-05 06:40:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3932160. Throughput: 0: 286.5. Samples: 983050. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:50,485][00034] Avg episode reward: [(0, '-4.103')] [2024-08-05 06:40:50,486][00132] Saving new best policy, reward=-4.103! [2024-08-05 06:40:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3932160. Throughput: 0: 288.5. Samples: 984832. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:40:55,484][00034] Avg episode reward: [(0, '-4.103')] [2024-08-05 06:41:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3940352. Throughput: 0: 288.0. Samples: 985690. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:00,484][00034] Avg episode reward: [(0, '-4.103')] [2024-08-05 06:41:02,429][00139] DAMAGECOUNT value on done: 29232.0 [2024-08-05 06:41:02,429][00139] Sum rewards: -2.464, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.435', 'AMMO2': '0.003', 'AMMO4': '0.012', 'AMMO5': '0.024', 'weapon4': '0.036', 'WEAPON4': '0.050', 'weapon5': '0.136', 'AMMO3': '0.172', 'HITCOUNT': '0.220', 'WEAPON5': '0.400', 'ARMOR': '0.424', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.254', 'weapon2': '1.400', 'weapon3': '1.690', 'FRAGCOUNT': '3.500'} [2024-08-05 06:41:02,654][00139] DAMAGECOUNT value on done: 29613.0 [2024-08-05 06:41:02,655][00139] Sum rewards: -6.125, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.140', 'weapon5': '0.006', 'weapon4': '0.008', 'WEAPON1': '0.010', 'AMMO2': '0.019', 'AMMO5': '0.020', 'ARMOR': '0.024', 'AMMO4': '0.095', 'WEAPON4': '0.100', 'HITCOUNT': '0.190', 'AMMO3': '0.219', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.675', 'WEAPON3': '1.150', 'weapon2': '1.498', 'weapon3': '1.750', 'FRAGCOUNT': '2.000'} [2024-08-05 06:41:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3948544. Throughput: 0: 287.7. Samples: 987410. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:05,484][00034] Avg episode reward: [(0, '-4.066')] [2024-08-05 06:41:05,492][00132] Saving new best policy, reward=-4.066! [2024-08-05 06:41:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3948544. Throughput: 0: 287.1. Samples: 989077. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:10,484][00034] Avg episode reward: [(0, '-4.066')] [2024-08-05 06:41:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3956736. Throughput: 0: 286.9. Samples: 989927. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:15,484][00034] Avg episode reward: [(0, '-4.066')] [2024-08-05 06:41:17,299][00139] DAMAGECOUNT value on done: 29427.0 [2024-08-05 06:41:17,299][00139] Sum rewards: -0.835, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.405', 'AMMO2': '0.015', 'WEAPON1': '0.020', 'weapon5': '0.022', 'AMMO5': '0.028', 'weapon4': '0.044', 'ARMOR': '0.064', 'AMMO4': '0.076', 'AMMO3': '0.086', 'HITCOUNT': '0.140', 'WEAPON4': '0.150', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.600', 'weapon3': '1.550', 'weapon2': '1.590', 'FRAGCOUNT': '2.000'} [2024-08-05 06:41:17,527][00139] DAMAGECOUNT value on done: 29877.0 [2024-08-05 06:41:17,527][00139] Sum rewards: -2.546, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.107', 'AMMO2': '0.001', 'AMMO4': '0.007', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'weapon5': '0.018', 'ARMOR': '0.029', 'weapon7': '0.078', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.148', 'WEAPON5': '0.250', 'HITCOUNT': '0.260', 'DAMAGECOUNT': '0.792', 'WEAPON3': '0.900', 'weapon2': '1.324', 'FRAGCOUNT': '1.500', 'weapon3': '1.930'} [2024-08-05 06:41:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3964928. Throughput: 0: 286.9. Samples: 991634. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:20,484][00034] Avg episode reward: [(0, '-4.017')] [2024-08-05 06:41:20,487][00132] Saving new best policy, reward=-4.017! [2024-08-05 06:41:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 3973120. Throughput: 0: 287.7. Samples: 993368. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:25,484][00034] Avg episode reward: [(0, '-4.017')] [2024-08-05 06:41:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3973120. Throughput: 0: 288.7. Samples: 994282. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:30,485][00034] Avg episode reward: [(0, '-4.017')] [2024-08-05 06:41:31,773][00139] DAMAGECOUNT value on done: 29662.0 [2024-08-05 06:41:31,774][00139] Sum rewards: -5.692, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-2.000', 'HEALTH': '-1.532', 'AMMO4': '-0.007', 'AMMO2': '-0.001', 'AMMO5': '0.013', 'WEAPON1': '0.020', 'WEAPON4': '0.100', 'weapon5': '0.128', 'WEAPON5': '0.150', 'AMMO3': '0.160', 'weapon4': '0.174', 'HITCOUNT': '0.200', 'DAMAGECOUNT': '0.705', 'WEAPON3': '0.900', 'weapon2': '1.146', 'weapon3': '1.652'} [2024-08-05 06:41:31,988][00139] DAMAGECOUNT value on done: 29968.0 [2024-08-05 06:41:31,988][00139] Sum rewards: -2.910, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.510', 'AMMO2': '0.007', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'ARMOR': '0.020', 'AMMO4': '0.034', 'HITCOUNT': '0.100', 'AMMO3': '0.135', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.273', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.708', 'weapon3': '1.748'} [2024-08-05 06:41:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3981312. Throughput: 0: 289.0. Samples: 996056. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:35,484][00034] Avg episode reward: [(0, '-3.972')] [2024-08-05 06:41:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000486_3981312.pth... [2024-08-05 06:41:35,571][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000453_3710976.pth [2024-08-05 06:41:35,581][00132] Saving new best policy, reward=-3.972! [2024-08-05 06:41:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 3989504. Throughput: 0: 286.5. Samples: 997723. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:40,484][00034] Avg episode reward: [(0, '-3.972')] [2024-08-05 06:41:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3989504. Throughput: 0: 287.1. Samples: 998609. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:45,485][00034] Avg episode reward: [(0, '-3.972')] [2024-08-05 06:41:46,303][00139] DAMAGECOUNT value on done: 29885.0 [2024-08-05 06:41:46,303][00139] Sum rewards: 2.322, reward structure: {'DEATHCOUNT': '-3.750', 'HEALTH': '-1.014', 'AMMO2': '0.007', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'weapon4': '0.016', 'AMMO4': '0.034', 'HITCOUNT': '0.060', 'WEAPON4': '0.100', 'AMMO3': '0.123', 'weapon5': '0.124', 'WEAPON5': '0.150', 'ARMOR': '0.400', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.669', 'weapon2': '1.222', 'weapon3': '1.614', 'FRAGCOUNT': '2.000'} [2024-08-05 06:41:46,532][00139] DAMAGECOUNT value on done: 30175.0 [2024-08-05 06:41:46,532][00139] Sum rewards: -5.278, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.707', 'AMMO2': '0.012', 'AMMO5': '0.022', 'AMMO4': '0.062', 'weapon5': '0.062', 'WEAPON4': '0.100', 'weapon4': '0.110', 'AMMO3': '0.158', 'HITCOUNT': '0.160', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.621', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.180', 'weapon3': '1.942'} [2024-08-05 06:41:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 3997696. Throughput: 0: 287.5. Samples: 1000349. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:50,484][00034] Avg episode reward: [(0, '-3.805')] [2024-08-05 06:41:50,486][00132] Saving new best policy, reward=-3.805! [2024-08-05 06:41:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4005888. Throughput: 0: 289.6. Samples: 1002107. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:41:55,484][00034] Avg episode reward: [(0, '-3.805')] [2024-08-05 06:42:00,228][00138] Updated weights for policy 0, policy_version 490 (0.0017) [2024-08-05 06:42:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4014080. Throughput: 0: 289.6. Samples: 1002959. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:00,484][00034] Avg episode reward: [(0, '-3.805')] [2024-08-05 06:42:00,912][00139] DAMAGECOUNT value on done: 30085.0 [2024-08-05 06:42:00,913][00139] Sum rewards: -2.331, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.044', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.037', 'ARMOR': '0.040', 'AMMO3': '0.119', 'WEAPON5': '0.150', 'HITCOUNT': '0.180', 'weapon4': '0.184', 'AMMO4': '0.187', 'WEAPON4': '0.350', 'DAMAGECOUNT': '0.600', 'WEAPON3': '0.700', 'weapon3': '1.434', 'weapon2': '1.716', 'FRAGCOUNT': '2.000'} [2024-08-05 06:42:01,133][00139] DAMAGECOUNT value on done: 30590.0 [2024-08-05 06:42:01,134][00139] Sum rewards: 1.493, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.422', 'AMMO5': '0.015', 'weapon4': '0.020', 'AMMO2': '0.023', 'AMMO3': '0.067', 'WEAPON4': '0.100', 'AMMO4': '0.112', 'weapon5': '0.206', 'WEAPON5': '0.250', 'HITCOUNT': '0.310', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.245', 'weapon3': '1.524', 'weapon2': '1.542', 'FRAGCOUNT': '2.000'} [2024-08-05 06:42:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4014080. Throughput: 0: 290.0. Samples: 1004682. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:05,484][00034] Avg episode reward: [(0, '-3.771')] [2024-08-05 06:42:05,492][00132] Saving new best policy, reward=-3.771! [2024-08-05 06:42:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4022272. Throughput: 0: 286.7. Samples: 1006268. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:10,484][00034] Avg episode reward: [(0, '-3.771')] [2024-08-05 06:42:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4030464. Throughput: 0: 286.1. Samples: 1007156. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:15,485][00034] Avg episode reward: [(0, '-3.771')] [2024-08-05 06:42:15,836][00139] DAMAGECOUNT value on done: 30304.0 [2024-08-05 06:42:15,837][00139] Sum rewards: -1.094, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.805', 'weapon5': '0.006', 'AMMO2': '0.007', 'AMMO5': '0.013', 'AMMO4': '0.036', 'WEAPON4': '0.050', 'weapon7': '0.068', 'weapon4': '0.092', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.129', 'WEAPON5': '0.150', 'ARMOR': '0.509', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.657', 'weapon2': '1.276', 'weapon3': '1.958', 'FRAGCOUNT': '2.000'} [2024-08-05 06:42:16,061][00139] DAMAGECOUNT value on done: 30950.0 [2024-08-05 06:42:16,061][00139] Sum rewards: -9.185, reward structure: {'DEATHCOUNT': '-15.000', 'HEALTH': '-2.790', 'AMMO2': '0.007', 'weapon5': '0.014', 'AMMO5': '0.025', 'weapon4': '0.028', 'AMMO4': '0.035', 'WEAPON4': '0.100', 'AMMO3': '0.219', 'HITCOUNT': '0.310', 'WEAPON5': '0.400', 'weapon2': '1.048', 'DAMAGECOUNT': '1.080', 'WEAPON3': '1.300', 'FRAGCOUNT': '2.000', 'weapon3': '2.038'} [2024-08-05 06:42:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4030464. Throughput: 0: 285.6. Samples: 1008906. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:20,485][00034] Avg episode reward: [(0, '-3.801')] [2024-08-05 06:42:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4038656. Throughput: 0: 286.6. Samples: 1010619. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:25,484][00034] Avg episode reward: [(0, '-3.801')] [2024-08-05 06:42:30,320][00139] DAMAGECOUNT value on done: 30399.0 [2024-08-05 06:42:30,320][00139] Sum rewards: -4.293, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.460', 'AMMO2': '0.011', 'AMMO4': '0.056', 'HITCOUNT': '0.080', 'WEAPON4': '0.100', 'ARMOR': '0.132', 'AMMO3': '0.149', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.312', 'weapon3': '2.142'} [2024-08-05 06:42:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4046848. Throughput: 0: 286.6. Samples: 1011506. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:30,484][00034] Avg episode reward: [(0, '-3.827')] [2024-08-05 06:42:30,545][00139] DAMAGECOUNT value on done: 31243.0 [2024-08-05 06:42:30,545][00139] Sum rewards: -0.086, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.702', 'AMMO2': '0.006', 'AMMO5': '0.020', 'AMMO4': '0.030', 'weapon4': '0.042', 'weapon5': '0.048', 'WEAPON4': '0.050', 'AMMO3': '0.147', 'HITCOUNT': '0.230', 'WEAPON5': '0.300', 'WEAPON3': '0.750', 'ARMOR': '0.844', 'weapon2': '0.870', 'DAMAGECOUNT': '0.879', 'FRAGCOUNT': '1.000', 'weapon3': '2.150'} [2024-08-05 06:42:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4046848. Throughput: 0: 286.2. Samples: 1013227. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:35,485][00034] Avg episode reward: [(0, '-3.790')] [2024-08-05 06:42:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4055040. Throughput: 0: 284.8. Samples: 1014925. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:40,484][00034] Avg episode reward: [(0, '-3.790')] [2024-08-05 06:42:45,205][00139] DAMAGECOUNT value on done: 30504.0 [2024-08-05 06:42:45,205][00139] Sum rewards: -5.276, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-0.884', 'ARMOR': '0.004', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'weapon5': '0.018', 'AMMO2': '0.021', 'HITCOUNT': '0.080', 'weapon4': '0.084', 'AMMO4': '0.102', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.167', 'DAMAGECOUNT': '0.315', 'WEAPON3': '1.000', 'weapon2': '1.540', 'weapon3': '1.710', 'FRAGCOUNT': '3.000'} [2024-08-05 06:42:45,435][00139] DAMAGECOUNT value on done: 31385.0 [2024-08-05 06:42:45,435][00139] Sum rewards: -2.456, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.142', 'AMMO2': '0.003', 'AMMO4': '0.015', 'AMMO5': '0.015', 'weapon4': '0.092', 'WEAPON4': '0.100', 'HITCOUNT': '0.140', 'AMMO3': '0.145', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.426', 'ARMOR': '0.482', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.314', 'weapon2': '1.504'} [2024-08-05 06:42:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4063232. Throughput: 0: 284.4. Samples: 1015756. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:45,484][00034] Avg episode reward: [(0, '-3.751')] [2024-08-05 06:42:45,492][00132] Saving new best policy, reward=-3.751! [2024-08-05 06:42:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4063232. Throughput: 0: 284.1. Samples: 1017465. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:50,484][00034] Avg episode reward: [(0, '-3.751')] [2024-08-05 06:42:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4071424. Throughput: 0: 287.7. Samples: 1019215. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:42:55,486][00034] Avg episode reward: [(0, '-3.751')] [2024-08-05 06:42:59,746][00139] DAMAGECOUNT value on done: 30809.0 [2024-08-05 06:42:59,746][00139] Sum rewards: -2.017, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.965', 'AMMO4': '-0.026', 'AMMO2': '-0.005', 'weapon7': '0.008', 'WEAPON1': '0.020', 'weapon5': '0.024', 'AMMO5': '0.030', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'HITCOUNT': '0.180', 'AMMO3': '0.200', 'WEAPON5': '0.500', 'DAMAGECOUNT': '0.915', 'WEAPON3': '1.150', 'weapon3': '1.640', 'weapon2': '1.762', 'FRAGCOUNT': '4.000'} [2024-08-05 06:42:59,965][00139] DAMAGECOUNT value on done: 31584.0 [2024-08-05 06:42:59,965][00139] Sum rewards: -1.267, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.274', 'AMMO5': '0.015', 'AMMO2': '0.025', 'ARMOR': '0.067', 'WEAPON4': '0.100', 'AMMO3': '0.112', 'AMMO4': '0.123', 'HITCOUNT': '0.190', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.597', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.542', 'weapon2': '1.836'} [2024-08-05 06:43:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4079616. Throughput: 0: 286.9. Samples: 1020066. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:00,484][00034] Avg episode reward: [(0, '-3.698')] [2024-08-05 06:43:00,485][00132] Saving new best policy, reward=-3.698! [2024-08-05 06:43:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4087808. Throughput: 0: 287.2. Samples: 1021828. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:05,484][00034] Avg episode reward: [(0, '-3.698')] [2024-08-05 06:43:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4087808. Throughput: 0: 286.9. Samples: 1023528. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:10,484][00034] Avg episode reward: [(0, '-3.698')] [2024-08-05 06:43:11,860][00138] Updated weights for policy 0, policy_version 500 (0.0018) [2024-08-05 06:43:14,495][00139] DAMAGECOUNT value on done: 30944.0 [2024-08-05 06:43:14,496][00139] Sum rewards: -2.378, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.808', 'weapon4': '0.010', 'weapon7': '0.012', 'AMMO2': '0.018', 'WEAPON1': '0.020', 'AMMO5': '0.022', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'weapon5': '0.054', 'HITCOUNT': '0.080', 'AMMO4': '0.089', 'AMMO3': '0.172', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.405', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.364', 'weapon3': '1.744'} [2024-08-05 06:43:14,719][00139] DAMAGECOUNT value on done: 31724.0 [2024-08-05 06:43:14,720][00139] Sum rewards: -5.268, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.600', 'AMMO5': '0.010', 'AMMO2': '0.030', 'ARMOR': '0.048', 'weapon4': '0.096', 'weapon5': '0.104', 'HITCOUNT': '0.110', 'AMMO4': '0.151', 'WEAPON4': '0.200', 'WEAPON5': '0.200', 'AMMO3': '0.222', 'DAMAGECOUNT': '0.420', 'weapon2': '0.972', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'weapon3': '2.118'} [2024-08-05 06:43:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4096000. Throughput: 0: 285.4. Samples: 1024347. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:15,485][00034] Avg episode reward: [(0, '-3.674')] [2024-08-05 06:43:15,492][00132] Saving new best policy, reward=-3.674! [2024-08-05 06:43:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4104192. Throughput: 0: 284.2. Samples: 1026016. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:20,485][00034] Avg episode reward: [(0, '-3.674')] [2024-08-05 06:43:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 4104192. Throughput: 0: 284.2. Samples: 1027716. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:25,484][00034] Avg episode reward: [(0, '-3.674')] [2024-08-05 06:43:29,296][00139] DAMAGECOUNT value on done: 31089.0 [2024-08-05 06:43:29,520][00139] DAMAGECOUNT value on done: 31929.0 [2024-08-05 06:43:29,521][00139] Sum rewards: -9.328, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.890', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.002', 'weapon5': '0.006', 'AMMO4': '0.010', 'AMMO5': '0.015', 'weapon7': '0.068', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.150', 'WEAPON7': '0.200', 'AMMO3': '0.202', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.615', 'WEAPON3': '1.000', 'weapon2': '1.110', 'weapon3': '1.684'} [2024-08-05 06:43:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4112384. Throughput: 0: 285.3. Samples: 1028596. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:30,484][00034] Avg episode reward: [(0, '-3.793')] [2024-08-05 06:43:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4120576. Throughput: 0: 285.9. Samples: 1030331. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:35,484][00034] Avg episode reward: [(0, '-3.793')] [2024-08-05 06:43:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000503_4120576.pth... [2024-08-05 06:43:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000469_3842048.pth [2024-08-05 06:43:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4120576. Throughput: 0: 286.0. Samples: 1032085. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:40,484][00034] Avg episode reward: [(0, '-3.793')] [2024-08-05 06:43:43,936][00139] DAMAGECOUNT value on done: 31497.0 [2024-08-05 06:43:43,937][00139] Sum rewards: -0.208, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.880', 'AMMO2': '0.014', 'AMMO5': '0.020', 'WEAPON1': '0.020', 'weapon5': '0.030', 'WEAPON4': '0.050', 'weapon4': '0.066', 'AMMO4': '0.069', 'weapon7': '0.096', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.155', 'HITCOUNT': '0.320', 'WEAPON5': '0.350', 'weapon2': '0.616', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.224', 'weapon3': '2.192', 'FRAGCOUNT': '4.000'} [2024-08-05 06:43:44,230][00139] DAMAGECOUNT value on done: 32134.0 [2024-08-05 06:43:44,231][00139] Sum rewards: -4.956, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.926', 'AMMO5': '0.006', 'WEAPON1': '0.020', 'AMMO2': '0.020', 'weapon5': '0.046', 'ARMOR': '0.060', 'AMMO4': '0.101', 'AMMO3': '0.150', 'WEAPON5': '0.150', 'HITCOUNT': '0.200', 'weapon4': '0.228', 'WEAPON4': '0.300', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.900', 'weapon2': '1.486', 'weapon3': '1.688', 'FRAGCOUNT': '3.000'} [2024-08-05 06:43:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4128768. Throughput: 0: 286.1. Samples: 1032941. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:45,485][00034] Avg episode reward: [(0, '-3.785')] [2024-08-05 06:43:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4136960. Throughput: 0: 284.4. Samples: 1034627. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:50,485][00034] Avg episode reward: [(0, '-3.785')] [2024-08-05 06:43:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4145152. Throughput: 0: 284.9. Samples: 1036350. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:43:55,484][00034] Avg episode reward: [(0, '-3.785')] [2024-08-05 06:43:58,546][00139] DAMAGECOUNT value on done: 31787.0 [2024-08-05 06:43:58,547][00139] Sum rewards: -2.666, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.640', 'weapon5': '0.004', 'AMMO2': '0.007', 'AMMO5': '0.020', 'AMMO4': '0.037', 'AMMO3': '0.132', 'HITCOUNT': '0.250', 'WEAPON5': '0.300', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.870', 'weapon2': '1.546', 'weapon3': '1.658', 'FRAGCOUNT': '3.000'} [2024-08-05 06:43:58,783][00139] DAMAGECOUNT value on done: 32229.0 [2024-08-05 06:43:58,783][00139] Sum rewards: -3.610, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.700', 'ARMOR': '0.004', 'AMMO2': '0.011', 'AMMO5': '0.014', 'WEAPON1': '0.020', 'weapon5': '0.026', 'AMMO4': '0.055', 'weapon7': '0.074', 'AMMO3': '0.082', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.120', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'weapon4': '0.250', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon3': '1.216', 'weapon2': '1.232'} [2024-08-05 06:44:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4145152. Throughput: 0: 286.1. Samples: 1037220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:00,485][00034] Avg episode reward: [(0, '-3.747')] [2024-08-05 06:44:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4153344. Throughput: 0: 286.9. Samples: 1038926. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:05,485][00034] Avg episode reward: [(0, '-3.747')] [2024-08-05 06:44:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4161536. Throughput: 0: 287.5. Samples: 1040653. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:10,485][00034] Avg episode reward: [(0, '-3.747')] [2024-08-05 06:44:13,031][00139] DAMAGECOUNT value on done: 32197.0 [2024-08-05 06:44:13,031][00139] Sum rewards: -5.108, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.334', 'ARMOR': '0.013', 'AMMO5': '0.020', 'AMMO2': '0.023', 'weapon4': '0.066', 'weapon5': '0.070', 'AMMO4': '0.113', 'AMMO3': '0.162', 'WEAPON4': '0.200', 'HITCOUNT': '0.300', 'WEAPON5': '0.350', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.230', 'weapon2': '1.362', 'weapon3': '1.916', 'FRAGCOUNT': '3.000'} [2024-08-05 06:44:13,247][00139] DAMAGECOUNT value on done: 32665.0 [2024-08-05 06:44:13,247][00139] Sum rewards: 1.922, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.552', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO2': '0.013', 'weapon5': '0.054', 'ARMOR': '0.056', 'weapon4': '0.066', 'AMMO4': '0.067', 'WEAPON4': '0.100', 'AMMO3': '0.128', 'WEAPON5': '0.250', 'HITCOUNT': '0.350', 'WEAPON3': '0.650', 'DAMAGECOUNT': '1.308', 'weapon2': '1.578', 'weapon3': '1.832', 'FRAGCOUNT': '5.000'} [2024-08-05 06:44:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4161536. Throughput: 0: 288.2. Samples: 1041565. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:15,485][00034] Avg episode reward: [(0, '-3.702')] [2024-08-05 06:44:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4169728. Throughput: 0: 286.7. Samples: 1043232. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:20,485][00034] Avg episode reward: [(0, '-3.702')] [2024-08-05 06:44:23,437][00138] Updated weights for policy 0, policy_version 510 (0.0017) [2024-08-05 06:44:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4177920. Throughput: 0: 286.6. Samples: 1044980. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:25,485][00034] Avg episode reward: [(0, '-3.702')] [2024-08-05 06:44:27,689][00139] DAMAGECOUNT value on done: 32399.0 [2024-08-05 06:44:27,689][00139] Sum rewards: -2.482, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.066', 'AMMO4': '-0.042', 'AMMO2': '-0.008', 'weapon4': '0.006', 'AMMO5': '0.015', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'weapon5': '0.064', 'AMMO3': '0.123', 'weapon7': '0.130', 'HITCOUNT': '0.180', 'WEAPON7': '0.200', 'AMMO6': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.606', 'WEAPON3': '0.700', 'ARMOR': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.524', 'weapon3': '1.716'} [2024-08-05 06:44:27,916][00139] DAMAGECOUNT value on done: 33339.0 [2024-08-05 06:44:27,917][00139] Sum rewards: 2.050, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.206', 'AMMO5': '0.007', 'AMMO2': '0.027', 'ARMOR': '0.064', 'weapon5': '0.098', 'WEAPON5': '0.100', 'AMMO3': '0.116', 'AMMO4': '0.135', 'weapon4': '0.158', 'WEAPON4': '0.250', 'HITCOUNT': '0.300', 'WEAPON3': '0.650', 'weapon2': '1.504', 'weapon3': '1.580', 'DAMAGECOUNT': '2.016', 'FRAGCOUNT': '5.000'} [2024-08-05 06:44:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4177920. Throughput: 0: 287.1. Samples: 1045861. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:30,485][00034] Avg episode reward: [(0, '-3.619')] [2024-08-05 06:44:30,545][00132] Saving new best policy, reward=-3.619! [2024-08-05 06:44:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4186112. Throughput: 0: 288.3. Samples: 1047600. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:35,484][00034] Avg episode reward: [(0, '-3.619')] [2024-08-05 06:44:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4194304. Throughput: 0: 288.3. Samples: 1049322. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:40,484][00034] Avg episode reward: [(0, '-3.619')] [2024-08-05 06:44:42,296][00139] DAMAGECOUNT value on done: 32709.0 [2024-08-05 06:44:42,296][00139] Sum rewards: -2.397, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.496', 'AMMO5': '0.003', 'AMMO2': '0.016', 'weapon4': '0.026', 'ARMOR': '0.032', 'WEAPON5': '0.050', 'AMMO4': '0.082', 'AMMO3': '0.094', 'WEAPON4': '0.150', 'HITCOUNT': '0.240', 'WEAPON3': '0.400', 'DAMAGECOUNT': '0.930', 'weapon3': '1.658', 'weapon2': '1.918', 'FRAGCOUNT': '3.000'} [2024-08-05 06:44:42,528][00139] DAMAGECOUNT value on done: 33384.0 [2024-08-05 06:44:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4202496. Throughput: 0: 288.0. Samples: 1050182. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:45,484][00034] Avg episode reward: [(0, '-3.615')] [2024-08-05 06:44:45,492][00132] Saving new best policy, reward=-3.615! [2024-08-05 06:44:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4202496. Throughput: 0: 286.6. Samples: 1051823. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:50,485][00034] Avg episode reward: [(0, '-3.615')] [2024-08-05 06:44:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4210688. Throughput: 0: 285.5. Samples: 1053502. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:44:55,486][00034] Avg episode reward: [(0, '-3.615')] [2024-08-05 06:44:57,395][00139] DAMAGECOUNT value on done: 32944.0 [2024-08-05 06:44:57,395][00139] Sum rewards: -3.427, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.357', 'AMMO2': '0.004', 'AMMO5': '0.014', 'AMMO4': '0.018', 'WEAPON1': '0.020', 'ARMOR': '0.028', 'weapon5': '0.048', 'HITCOUNT': '0.080', 'AMMO3': '0.131', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.200', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.705', 'weapon2': '1.432', 'weapon3': '1.750'} [2024-08-05 06:44:57,619][00139] DAMAGECOUNT value on done: 33519.0 [2024-08-05 06:44:57,619][00139] Sum rewards: -6.417, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.072', 'AMMO5': '0.008', 'AMMO2': '0.027', 'WEAPON1': '0.030', 'weapon5': '0.058', 'weapon4': '0.094', 'AMMO3': '0.121', 'HITCOUNT': '0.130', 'AMMO4': '0.132', 'WEAPON5': '0.150', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.405', 'ARMOR': '0.565', 'WEAPON3': '0.650', 'weapon3': '1.450', 'weapon2': '1.884'} [2024-08-05 06:45:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4218880. Throughput: 0: 283.9. Samples: 1054340. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:00,484][00034] Avg episode reward: [(0, '-3.626')] [2024-08-05 06:45:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4218880. Throughput: 0: 284.7. Samples: 1056042. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:05,484][00034] Avg episode reward: [(0, '-3.626')] [2024-08-05 06:45:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4227072. Throughput: 0: 284.1. Samples: 1057764. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:10,485][00034] Avg episode reward: [(0, '-3.626')] [2024-08-05 06:45:12,120][00139] DAMAGECOUNT value on done: 33219.0 [2024-08-05 06:45:12,121][00139] Sum rewards: -1.234, reward structure: {'DEATHCOUNT': '-9.000', 'AMMO2': '0.012', 'WEAPON1': '0.020', 'AMMO5': '0.030', 'weapon5': '0.056', 'AMMO4': '0.060', 'AMMO3': '0.122', 'HITCOUNT': '0.230', 'HEALTH': '0.292', 'WEAPON5': '0.450', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.825', 'weapon3': '1.578', 'weapon2': '1.590', 'FRAGCOUNT': '2.000'} [2024-08-05 06:45:12,352][00139] DAMAGECOUNT value on done: 33740.0 [2024-08-05 06:45:12,353][00139] Sum rewards: -0.185, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.632', 'AMMO4': '-0.033', 'AMMO2': '-0.007', 'AMMO5': '0.005', 'ARMOR': '0.072', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.147', 'HITCOUNT': '0.200', 'weapon4': '0.220', 'DAMAGECOUNT': '0.663', 'WEAPON3': '0.800', 'weapon2': '1.242', 'weapon3': '1.938', 'FRAGCOUNT': '2.000'} [2024-08-05 06:45:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4235264. Throughput: 0: 283.6. Samples: 1058624. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:15,487][00034] Avg episode reward: [(0, '-3.532')] [2024-08-05 06:45:15,494][00132] Saving new best policy, reward=-3.532! [2024-08-05 06:45:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4235264. Throughput: 0: 282.8. Samples: 1060325. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:20,485][00034] Avg episode reward: [(0, '-3.532')] [2024-08-05 06:45:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4243456. Throughput: 0: 283.4. Samples: 1062073. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:25,484][00034] Avg episode reward: [(0, '-3.532')] [2024-08-05 06:45:26,794][00139] DAMAGECOUNT value on done: 33568.0 [2024-08-05 06:45:26,795][00139] Sum rewards: -7.175, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.858', 'AMMO4': '-0.019', 'AMMO2': '-0.004', 'AMMO5': '0.010', 'ARMOR': '0.052', 'weapon5': '0.058', 'AMMO3': '0.181', 'WEAPON5': '0.200', 'HITCOUNT': '0.230', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.047', 'FRAGCOUNT': '1.500', 'weapon3': '1.564', 'weapon2': '1.714'} [2024-08-05 06:45:27,023][00139] DAMAGECOUNT value on done: 34089.0 [2024-08-05 06:45:27,024][00139] Sum rewards: -2.871, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.134', 'weapon7': '0.002', 'AMMO2': '0.017', 'AMMO5': '0.018', 'weapon4': '0.052', 'AMMO4': '0.086', 'weapon5': '0.126', 'HITCOUNT': '0.150', 'AMMO3': '0.158', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON4': '0.250', 'WEAPON5': '0.400', 'ARMOR': '0.461', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.047', 'weapon2': '1.260', 'weapon3': '1.586', 'FRAGCOUNT': '2.000'} [2024-08-05 06:45:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4251648. Throughput: 0: 283.1. Samples: 1062921. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:30,484][00034] Avg episode reward: [(0, '-3.514')] [2024-08-05 06:45:30,489][00132] Saving new best policy, reward=-3.514! [2024-08-05 06:45:35,215][00138] Updated weights for policy 0, policy_version 520 (0.0017) [2024-08-05 06:45:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4259840. Throughput: 0: 285.5. Samples: 1064671. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:35,484][00034] Avg episode reward: [(0, '-3.514')] [2024-08-05 06:45:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000520_4259840.pth... 
[2024-08-05 06:45:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000486_3981312.pth [2024-08-05 06:45:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4259840. Throughput: 0: 286.4. Samples: 1066389. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:40,484][00034] Avg episode reward: [(0, '-3.514')] [2024-08-05 06:45:41,290][00139] DAMAGECOUNT value on done: 33949.0 [2024-08-05 06:45:41,291][00139] Sum rewards: -4.205, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.438', 'AMMO2': '0.006', 'AMMO5': '0.007', 'AMMO4': '0.032', 'weapon5': '0.048', 'WEAPON4': '0.050', 'weapon7': '0.072', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.159', 'weapon4': '0.170', 'HITCOUNT': '0.330', 'ARMOR': '0.448', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.143', 'weapon2': '1.502', 'weapon3': '1.566'} [2024-08-05 06:45:41,509][00139] DAMAGECOUNT value on done: 34481.0 [2024-08-05 06:45:41,509][00139] Sum rewards: -5.657, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.124', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO2': '0.014', 'weapon5': '0.024', 'AMMO4': '0.070', 'AMMO3': '0.182', 'WEAPON5': '0.250', 'HITCOUNT': '0.300', 'FRAGCOUNT': '0.500', 'weapon2': '0.810', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.176', 'weapon3': '2.318'} [2024-08-05 06:45:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4268032. Throughput: 0: 287.4. Samples: 1067273. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:45,484][00034] Avg episode reward: [(0, '-3.482')] [2024-08-05 06:45:45,492][00132] Saving new best policy, reward=-3.482! [2024-08-05 06:45:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4276224. Throughput: 0: 288.2. Samples: 1069010. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:50,485][00034] Avg episode reward: [(0, '-3.482')] [2024-08-05 06:45:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4276224. Throughput: 0: 288.2. Samples: 1070734. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:45:55,484][00034] Avg episode reward: [(0, '-3.482')] [2024-08-05 06:45:55,814][00139] DAMAGECOUNT value on done: 34199.0 [2024-08-05 06:45:55,815][00139] Sum rewards: -4.714, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.670', 'FRAGCOUNT': '-0.500', 'WEAPON1': '0.010', 'AMMO5': '0.010', 'weapon4': '0.014', 'AMMO2': '0.019', 'weapon5': '0.050', 'AMMO4': '0.097', 'WEAPON4': '0.100', 'AMMO3': '0.120', 'HITCOUNT': '0.140', 'WEAPON5': '0.250', 'ARMOR': '0.485', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.750', 'weapon2': '1.160', 'weapon3': '1.750'} [2024-08-05 06:45:56,023][00139] DAMAGECOUNT value on done: 34853.0 [2024-08-05 06:45:56,023][00139] Sum rewards: -2.248, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.033', 'AMMO5': '0.007', 'ARMOR': '0.012', 'AMMO2': '0.019', 'WEAPON1': '0.020', 'weapon5': '0.058', 'weapon4': '0.092', 'AMMO4': '0.096', 'AMMO3': '0.118', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'HITCOUNT': '0.280', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.116', 'weapon2': '1.384', 'weapon3': '1.832'} [2024-08-05 06:46:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4284416. Throughput: 0: 288.1. Samples: 1071587. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:00,484][00034] Avg episode reward: [(0, '-3.500')] [2024-08-05 06:46:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4292608. Throughput: 0: 287.8. Samples: 1073278. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:05,485][00034] Avg episode reward: [(0, '-3.500')] [2024-08-05 06:46:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4292608. Throughput: 0: 287.3. Samples: 1075002. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:10,484][00034] Avg episode reward: [(0, '-3.500')] [2024-08-05 06:46:10,584][00139] DAMAGECOUNT value on done: 34228.0 [2024-08-05 06:46:10,807][00139] DAMAGECOUNT value on done: 35033.0 [2024-08-05 06:46:10,808][00139] Sum rewards: -0.894, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.168', 'weapon4': '0.002', 'AMMO2': '0.014', 'AMMO5': '0.016', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'AMMO4': '0.069', 'weapon5': '0.076', 'AMMO3': '0.105', 'HITCOUNT': '0.150', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.540', 'WEAPON3': '0.650', 'weapon3': '1.188', 'weapon2': '1.794', 'FRAGCOUNT': '2.000'} [2024-08-05 06:46:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4300800. Throughput: 0: 287.3. Samples: 1075850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:15,484][00034] Avg episode reward: [(0, '-3.482')] [2024-08-05 06:46:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4308992. Throughput: 0: 287.8. Samples: 1077624. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:20,485][00034] Avg episode reward: [(0, '-3.482')] [2024-08-05 06:46:25,234][00139] DAMAGECOUNT value on done: 34365.0 [2024-08-05 06:46:25,456][00139] DAMAGECOUNT value on done: 35168.0 [2024-08-05 06:46:25,457][00139] Sum rewards: -11.428, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-3.120', 'FRAGCOUNT': '-0.500', 'ARMOR': '0.012', 'weapon5': '0.018', 'WEAPON1': '0.020', 'AMMO5': '0.025', 'AMMO2': '0.038', 'HITCOUNT': '0.100', 'AMMO3': '0.164', 'AMMO4': '0.191', 'weapon4': '0.274', 'WEAPON4': '0.350', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.405', 'WEAPON3': '0.950', 'weapon3': '0.994', 'weapon2': '1.750'} [2024-08-05 06:46:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4317184. Throughput: 0: 287.1. Samples: 1079307. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:25,484][00034] Avg episode reward: [(0, '-3.525')] [2024-08-05 06:46:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4317184. Throughput: 0: 286.6. Samples: 1080171. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:30,484][00034] Avg episode reward: [(0, '-3.525')] [2024-08-05 06:46:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4325376. Throughput: 0: 286.9. Samples: 1081919. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:35,484][00034] Avg episode reward: [(0, '-3.525')] [2024-08-05 06:46:39,757][00139] DAMAGECOUNT value on done: 34791.0 [2024-08-05 06:46:39,757][00139] Sum rewards: -0.808, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.699', 'AMMO2': '0.019', 'WEAPON1': '0.020', 'AMMO5': '0.025', 'weapon5': '0.072', 'AMMO3': '0.090', 'AMMO4': '0.096', 'HITCOUNT': '0.160', 'WEAPON5': '0.450', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.278', 'weapon3': '1.434', 'weapon2': '1.696', 'FRAGCOUNT': '3.000'} [2024-08-05 06:46:39,987][00139] DAMAGECOUNT value on done: 35352.0 [2024-08-05 06:46:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4333568. Throughput: 0: 286.7. Samples: 1083637. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:40,485][00034] Avg episode reward: [(0, '-3.498')] [2024-08-05 06:46:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4333568. Throughput: 0: 287.3. Samples: 1084517. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:45,484][00034] Avg episode reward: [(0, '-3.498')] [2024-08-05 06:46:46,394][00138] Updated weights for policy 0, policy_version 530 (0.0017) [2024-08-05 06:46:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4341760. Throughput: 0: 287.6. Samples: 1086221. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:50,484][00034] Avg episode reward: [(0, '-3.498')] [2024-08-05 06:46:54,694][00139] DAMAGECOUNT value on done: 34849.0 [2024-08-05 06:46:54,694][00139] Sum rewards: -3.512, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.988', 'AMMO5': '0.007', 'WEAPON1': '0.020', 'AMMO2': '0.030', 'weapon5': '0.050', 'ARMOR': '0.056', 'HITCOUNT': '0.070', 'AMMO3': '0.104', 'AMMO4': '0.148', 'DAMAGECOUNT': '0.174', 'weapon4': '0.184', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon3': '1.398', 'weapon2': '1.484'} [2024-08-05 06:46:54,922][00139] DAMAGECOUNT value on done: 35664.0 [2024-08-05 06:46:54,923][00139] Sum rewards: -4.362, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.936', 'AMMO5': '0.005', 'AMMO2': '0.012', 'WEAPON1': '0.020', 'AMMO4': '0.058', 'ARMOR': '0.068', 'weapon4': '0.096', 'WEAPON5': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.237', 'HITCOUNT': '0.250', 'DAMAGECOUNT': '0.936', 'WEAPON3': '1.000', 'weapon2': '1.292', 'weapon3': '1.850', 'FRAGCOUNT': '2.000'} [2024-08-05 06:46:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4349952. Throughput: 0: 285.6. Samples: 1087855. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:46:55,484][00034] Avg episode reward: [(0, '-3.472')] [2024-08-05 06:46:55,492][00132] Saving new best policy, reward=-3.472! [2024-08-05 06:47:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4349952. Throughput: 0: 286.3. Samples: 1088732. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:00,484][00034] Avg episode reward: [(0, '-3.472')] [2024-08-05 06:47:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4358144. Throughput: 0: 285.3. Samples: 1090463. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:05,484][00034] Avg episode reward: [(0, '-3.472')] [2024-08-05 06:47:09,279][00139] DAMAGECOUNT value on done: 35034.0 [2024-08-05 06:47:09,279][00139] Sum rewards: -1.831, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.284', 'weapon5': '0.002', 'AMMO5': '0.003', 'AMMO2': '0.010', 'WEAPON1': '0.010', 'weapon4': '0.028', 'ARMOR': '0.048', 'AMMO4': '0.049', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'AMMO3': '0.128', 'HITCOUNT': '0.130', 'DAMAGECOUNT': '0.555', 'WEAPON3': '0.650', 'weapon2': '1.402', 'weapon3': '1.588', 'FRAGCOUNT': '2.000'} [2024-08-05 06:47:09,514][00139] DAMAGECOUNT value on done: 35989.0 [2024-08-05 06:47:09,514][00139] Sum rewards: -4.611, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.952', 'weapon5': '0.008', 'AMMO5': '0.015', 'AMMO2': '0.018', 'WEAPON1': '0.020', 'ARMOR': '0.036', 'weapon4': '0.084', 'AMMO4': '0.089', 'WEAPON4': '0.100', 'AMMO3': '0.136', 'HITCOUNT': '0.240', 'WEAPON5': '0.250', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.975', 'FRAGCOUNT': '1.000', 'weapon3': '1.530', 'weapon2': '1.540'} [2024-08-05 06:47:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4366336. Throughput: 0: 286.3. Samples: 1092192. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:10,484][00034] Avg episode reward: [(0, '-3.383')] [2024-08-05 06:47:10,486][00132] Saving new best policy, reward=-3.383! [2024-08-05 06:47:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4374528. Throughput: 0: 287.0. Samples: 1093084. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:15,485][00034] Avg episode reward: [(0, '-3.383')] [2024-08-05 06:47:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4374528. Throughput: 0: 286.9. Samples: 1094830. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:20,484][00034] Avg episode reward: [(0, '-3.383')] [2024-08-05 06:47:23,652][00139] DAMAGECOUNT value on done: 35401.0 [2024-08-05 06:47:23,652][00139] Sum rewards: -1.973, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.974', 'AMMO2': '0.014', 'WEAPON1': '0.020', 'weapon5': '0.022', 'AMMO5': '0.023', 'ARMOR': '0.052', 'AMMO4': '0.072', 'weapon4': '0.122', 'WEAPON4': '0.150', 'AMMO3': '0.215', 'HITCOUNT': '0.270', 'WEAPON5': '0.400', 'weapon2': '1.050', 'WEAPON3': '1.100', 'DAMAGECOUNT': '1.101', 'weapon3': '1.890', 'FRAGCOUNT': '3.000'} [2024-08-05 06:47:23,895][00139] DAMAGECOUNT value on done: 36124.0 [2024-08-05 06:47:23,895][00139] Sum rewards: -5.840, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.754', 'AMMO2': '0.011', 'AMMO5': '0.020', 'WEAPON1': '0.020', 'weapon5': '0.050', 'AMMO4': '0.056', 'weapon4': '0.064', 'HITCOUNT': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.194', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.405', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon2': '1.122', 'weapon3': '1.772'} [2024-08-05 06:47:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4382720. Throughput: 0: 286.5. Samples: 1096529. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:25,485][00034] Avg episode reward: [(0, '-3.361')] [2024-08-05 06:47:25,492][00132] Saving new best policy, reward=-3.361! [2024-08-05 06:47:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4390912. Throughput: 0: 285.3. Samples: 1097356. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:30,484][00034] Avg episode reward: [(0, '-3.361')] [2024-08-05 06:47:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4390912. Throughput: 0: 285.9. Samples: 1099086. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:35,485][00034] Avg episode reward: [(0, '-3.361')] [2024-08-05 06:47:35,495][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000536_4390912.pth... [2024-08-05 06:47:35,571][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000503_4120576.pth [2024-08-05 06:47:38,478][00139] DAMAGECOUNT value on done: 35438.0 [2024-08-05 06:47:38,479][00139] Sum rewards: -9.450, reward structure: {'DEATHCOUNT': '-12.000', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.826', 'weapon5': '0.008', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.037', 'ARMOR': '0.040', 'HITCOUNT': '0.050', 'DAMAGECOUNT': '0.111', 'AMMO3': '0.145', 'AMMO4': '0.185', 'WEAPON5': '0.200', 'weapon4': '0.268', 'WEAPON4': '0.350', 'WEAPON3': '0.750', 'weapon2': '1.270', 'weapon3': '1.442'} [2024-08-05 06:47:38,687][00139] DAMAGECOUNT value on done: 36199.0 [2024-08-05 06:47:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4399104. Throughput: 0: 287.9. Samples: 1100809. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:40,484][00034] Avg episode reward: [(0, '-3.415')] [2024-08-05 06:47:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4407296. Throughput: 0: 287.9. Samples: 1101688. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:45,484][00034] Avg episode reward: [(0, '-3.415')] [2024-08-05 06:47:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4407296. Throughput: 0: 288.5. Samples: 1103444. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:50,484][00034] Avg episode reward: [(0, '-3.415')] [2024-08-05 06:47:52,884][00139] DAMAGECOUNT value on done: 35638.0 [2024-08-05 06:47:52,884][00139] Sum rewards: -2.798, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.740', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'weapon5': '0.018', 'AMMO2': '0.023', 'ARMOR': '0.080', 'WEAPON5': '0.100', 'AMMO4': '0.114', 'AMMO3': '0.114', 'HITCOUNT': '0.160', 'weapon4': '0.162', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.600', 'WEAPON3': '0.650', 'weapon3': '1.346', 'weapon2': '1.560', 'FRAGCOUNT': '2.000'} [2024-08-05 06:47:53,105][00139] DAMAGECOUNT value on done: 36294.0 [2024-08-05 06:47:53,106][00139] Sum rewards: -2.564, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.040', 'AMMO5': '0.003', 'AMMO2': '0.009', 'AMMO4': '0.044', 'weapon5': '0.044', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'ARMOR': '0.052', 'HITCOUNT': '0.100', 'AMMO3': '0.106', 'weapon4': '0.146', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon3': '1.334', 'weapon2': '1.404'} [2024-08-05 06:47:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4415488. Throughput: 0: 289.0. Samples: 1105199. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:47:55,484][00034] Avg episode reward: [(0, '-3.348')] [2024-08-05 06:47:55,493][00132] Saving new best policy, reward=-3.348! [2024-08-05 06:47:57,806][00138] Updated weights for policy 0, policy_version 540 (0.0017) [2024-08-05 06:48:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4423680. Throughput: 0: 287.6. Samples: 1106025. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:00,484][00034] Avg episode reward: [(0, '-3.348')] [2024-08-05 06:48:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4431872. Throughput: 0: 288.2. Samples: 1107798. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:05,484][00034] Avg episode reward: [(0, '-3.348')] [2024-08-05 06:48:07,325][00139] DAMAGECOUNT value on done: 35776.0 [2024-08-05 06:48:07,325][00139] Sum rewards: -1.298, reward structure: {'DEATHCOUNT': '-7.500', 'ARMOR': '0.004', 'AMMO5': '0.005', 'weapon5': '0.010', 'AMMO2': '0.020', 'WEAPON1': '0.030', 'weapon7': '0.074', 'AMMO3': '0.089', 'AMMO4': '0.100', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'HITCOUNT': '0.140', 'weapon4': '0.214', 'DAMAGECOUNT': '0.414', 'WEAPON3': '0.450', 'HEALTH': '0.544', 'FRAGCOUNT': '1.000', 'weapon3': '1.116', 'weapon2': '1.492'} [2024-08-05 06:48:07,543][00139] DAMAGECOUNT value on done: 36349.0 [2024-08-05 06:48:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4431872. Throughput: 0: 289.7. Samples: 1109565. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:10,485][00034] Avg episode reward: [(0, '-3.333')] [2024-08-05 06:48:10,487][00132] Saving new best policy, reward=-3.333! [2024-08-05 06:48:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4440064. Throughput: 0: 290.3. Samples: 1110419. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:15,484][00034] Avg episode reward: [(0, '-3.333')] [2024-08-05 06:48:17,146][00139] Large shaping reward -2.504 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.255, -85.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 06:48:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4448256. Throughput: 0: 289.3. Samples: 1112104. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:20,484][00034] Avg episode reward: [(0, '-3.333')] [2024-08-05 06:48:21,980][00139] DAMAGECOUNT value on done: 35836.0 [2024-08-05 06:48:21,980][00139] Sum rewards: -12.237, reward structure: {'DEATHCOUNT': '-14.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.480', 'ARMOR': '0.004', 'AMMO5': '0.010', 'weapon5': '0.020', 'AMMO2': '0.031', 'HITCOUNT': '0.050', 'AMMO4': '0.157', 'DAMAGECOUNT': '0.180', 'AMMO3': '0.185', 'WEAPON5': '0.200', 'weapon4': '0.278', 'WEAPON4': '0.300', 'WEAPON3': '0.900', 'weapon2': '1.232', 'weapon3': '1.446'} [2024-08-05 06:48:22,209][00139] DAMAGECOUNT value on done: 36394.0 [2024-08-05 06:48:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4448256. Throughput: 0: 290.1. Samples: 1113865. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:25,484][00034] Avg episode reward: [(0, '-3.457')] [2024-08-05 06:48:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4456448. Throughput: 0: 289.1. Samples: 1114698. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:30,484][00034] Avg episode reward: [(0, '-3.457')] [2024-08-05 06:48:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4464640. Throughput: 0: 288.8. Samples: 1116441. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:35,484][00034] Avg episode reward: [(0, '-3.457')] [2024-08-05 06:48:36,497][00139] DAMAGECOUNT value on done: 36006.0 [2024-08-05 06:48:36,497][00139] Sum rewards: -1.455, reward structure: {'DEATHCOUNT': '-9.750', 'AMMO5': '0.005', 'AMMO2': '0.021', 'ARMOR': '0.048', 'weapon5': '0.056', 'AMMO3': '0.103', 'AMMO4': '0.104', 'HITCOUNT': '0.120', 'WEAPON5': '0.150', 'WEAPON4': '0.200', 'HEALTH': '0.250', 'weapon4': '0.390', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.600', 'weapon3': '1.270', 'weapon2': '1.468', 'FRAGCOUNT': '3.000'} [2024-08-05 06:48:36,702][00139] DAMAGECOUNT value on done: 36542.0 [2024-08-05 06:48:36,703][00139] Sum rewards: 0.947, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.087', 'AMMO4': '-0.018', 'AMMO2': '-0.004', 'AMMO5': '0.005', 'WEAPON4': '0.100', 'AMMO3': '0.119', 'HITCOUNT': '0.130', 'weapon4': '0.202', 'DAMAGECOUNT': '0.444', 'WEAPON3': '0.450', 'ARMOR': '0.516', 'FRAGCOUNT': '1.000', 'weapon2': '1.184', 'weapon3': '1.406'} [2024-08-05 06:48:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4472832. Throughput: 0: 288.4. Samples: 1118179. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:40,484][00034] Avg episode reward: [(0, '-3.456')] [2024-08-05 06:48:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4472832. Throughput: 0: 289.0. Samples: 1119029. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:45,485][00034] Avg episode reward: [(0, '-3.456')] [2024-08-05 06:48:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 4481024. Throughput: 0: 288.0. Samples: 1120758. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:50,484][00034] Avg episode reward: [(0, '-3.456')] [2024-08-05 06:48:51,138][00139] DAMAGECOUNT value on done: 36283.0 [2024-08-05 06:48:51,139][00139] Sum rewards: -5.838, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.522', 'AMMO2': '0.005', 'AMMO5': '0.013', 'weapon5': '0.022', 'AMMO4': '0.025', 'ARMOR': '0.030', 'weapon4': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.108', 'HITCOUNT': '0.200', 'WEAPON5': '0.250', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.831', 'weapon2': '1.274', 'weapon3': '1.636', 'FRAGCOUNT': '2.000'} [2024-08-05 06:48:51,368][00139] DAMAGECOUNT value on done: 36594.0 [2024-08-05 06:48:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4489216. Throughput: 0: 287.4. Samples: 1122498. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:48:55,486][00034] Avg episode reward: [(0, '-3.510')] [2024-08-05 06:49:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4489216. Throughput: 0: 288.4. Samples: 1123397. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:00,484][00034] Avg episode reward: [(0, '-3.510')] [2024-08-05 06:49:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4497408. Throughput: 0: 289.1. Samples: 1125112. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:05,485][00034] Avg episode reward: [(0, '-3.510')] [2024-08-05 06:49:05,577][00139] DAMAGECOUNT value on done: 36529.0 [2024-08-05 06:49:05,578][00139] Sum rewards: -0.629, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.362', 'AMMO5': '0.003', 'AMMO2': '0.018', 'WEAPON5': '0.050', 'ARMOR': '0.056', 'AMMO4': '0.088', 'AMMO3': '0.131', 'weapon4': '0.160', 'WEAPON4': '0.200', 'HITCOUNT': '0.240', 'DAMAGECOUNT': '0.738', 'WEAPON3': '0.800', 'weapon2': '1.078', 'weapon3': '1.922', 'FRAGCOUNT': '2.000'} [2024-08-05 06:49:05,799][00139] DAMAGECOUNT value on done: 36764.0 [2024-08-05 06:49:05,799][00139] Sum rewards: -2.936, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.340', 'AMMO5': '0.020', 'WEAPON1': '0.020', 'AMMO2': '0.034', 'ARMOR': '0.064', 'HITCOUNT': '0.100', 'AMMO3': '0.104', 'weapon5': '0.114', 'weapon4': '0.148', 'AMMO4': '0.171', 'WEAPON4': '0.200', 'WEAPON5': '0.300', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.600', 'weapon2': '1.372', 'weapon3': '1.396'} [2024-08-05 06:49:08,644][00138] Updated weights for policy 0, policy_version 550 (0.0017) [2024-08-05 06:49:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4505600. Throughput: 0: 288.0. Samples: 1126825. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:10,485][00034] Avg episode reward: [(0, '-3.531')] [2024-08-05 06:49:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4505600. Throughput: 0: 289.2. Samples: 1127711. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:15,484][00034] Avg episode reward: [(0, '-3.531')] [2024-08-05 06:49:20,112][00139] DAMAGECOUNT value on done: 36651.0 [2024-08-05 06:49:20,333][00139] DAMAGECOUNT value on done: 37134.0 [2024-08-05 06:49:20,334][00139] Sum rewards: -1.673, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.455', 'AMMO2': '0.010', 'AMMO5': '0.010', 'AMMO4': '0.047', 'AMMO3': '0.119', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'HITCOUNT': '0.210', 'weapon4': '0.254', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.110', 'weapon3': '1.410', 'weapon2': '1.512', 'FRAGCOUNT': '3.000'} [2024-08-05 06:49:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4513792. Throughput: 0: 289.1. Samples: 1129450. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:20,484][00034] Avg episode reward: [(0, '-3.540')] [2024-08-05 06:49:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4521984. Throughput: 0: 289.8. Samples: 1131218. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:25,484][00034] Avg episode reward: [(0, '-3.540')] [2024-08-05 06:49:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4530176. Throughput: 0: 291.2. Samples: 1132131. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:30,485][00034] Avg episode reward: [(0, '-3.540')] [2024-08-05 06:49:34,586][00139] DAMAGECOUNT value on done: 36845.0 [2024-08-05 06:49:34,586][00139] Sum rewards: -9.811, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-3.202', 'AMMO5': '0.005', 'AMMO2': '0.007', 'weapon5': '0.016', 'WEAPON1': '0.020', 'AMMO4': '0.033', 'weapon4': '0.058', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'ARMOR': '0.112', 'HITCOUNT': '0.170', 'AMMO3': '0.244', 'DAMAGECOUNT': '0.582', 'FRAGCOUNT': '1.000', 'weapon2': '1.084', 'WEAPON3': '1.500', 'weapon3': '1.860'} [2024-08-05 06:49:34,801][00139] DAMAGECOUNT value on done: 37302.0 [2024-08-05 06:49:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4530176. Throughput: 0: 289.9. Samples: 1133805. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:35,485][00034] Avg episode reward: [(0, '-3.676')] [2024-08-05 06:49:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000553_4530176.pth... [2024-08-05 06:49:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000520_4259840.pth [2024-08-05 06:49:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4538368. Throughput: 0: 289.5. Samples: 1135524. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:40,485][00034] Avg episode reward: [(0, '-3.676')] [2024-08-05 06:49:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4546560. Throughput: 0: 289.4. Samples: 1136421. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:45,485][00034] Avg episode reward: [(0, '-3.676')] [2024-08-05 06:49:49,042][00139] DAMAGECOUNT value on done: 36870.0 [2024-08-05 06:49:49,263][00139] DAMAGECOUNT value on done: 37503.0 [2024-08-05 06:49:49,263][00139] Sum rewards: -3.134, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.945', 'weapon5': '0.010', 'WEAPON1': '0.020', 'AMMO5': '0.020', 'AMMO2': '0.031', 'weapon4': '0.058', 'ARMOR': '0.104', 'AMMO4': '0.156', 'HITCOUNT': '0.180', 'AMMO3': '0.191', 'WEAPON4': '0.200', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.603', 'weapon2': '0.858', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon3': '2.180'} [2024-08-05 06:49:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 4546560. Throughput: 0: 290.0. Samples: 1138160. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:50,485][00034] Avg episode reward: [(0, '-3.693')] [2024-08-05 06:49:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4554752. Throughput: 0: 290.7. Samples: 1139908. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:49:55,484][00034] Avg episode reward: [(0, '-3.693')] [2024-08-05 06:50:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4562944. Throughput: 0: 290.9. Samples: 1140800. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:00,484][00034] Avg episode reward: [(0, '-3.693')] [2024-08-05 06:50:03,554][00139] DAMAGECOUNT value on done: 36985.0 [2024-08-05 06:50:03,554][00139] Sum rewards: -5.171, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.708', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.034', 'HITCOUNT': '0.060', 'weapon5': '0.070', 'weapon4': '0.106', 'weapon7': '0.106', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.122', 'WEAPON5': '0.150', 'AMMO4': '0.168', 'WEAPON7': '0.200', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.345', 'ARMOR': '0.468', 'WEAPON3': '0.600', 'weapon3': '0.896', 'weapon2': '1.952'} [2024-08-05 06:50:03,784][00139] DAMAGECOUNT value on done: 37513.0 [2024-08-05 06:50:03,784][00139] Sum rewards: -6.576, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.795', 'FRAGCOUNT': '-1.500', 'AMMO2': '0.009', 'WEAPON1': '0.010', 'HITCOUNT': '0.010', 'AMMO5': '0.011', 'DAMAGECOUNT': '0.030', 'ARMOR': '0.040', 'AMMO4': '0.046', 'weapon5': '0.112', 'AMMO3': '0.133', 'WEAPON4': '0.150', 'weapon4': '0.152', 'WEAPON5': '0.200', 'WEAPON3': '0.700', 'weapon2': '1.244', 'weapon3': '1.372'} [2024-08-05 06:50:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4571136. Throughput: 0: 289.9. Samples: 1142496. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:05,484][00034] Avg episode reward: [(0, '-3.722')] [2024-08-05 06:50:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4571136. Throughput: 0: 289.3. Samples: 1144238. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:10,484][00034] Avg episode reward: [(0, '-3.722')] [2024-08-05 06:50:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4579328. Throughput: 0: 288.5. Samples: 1145114. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:15,484][00034] Avg episode reward: [(0, '-3.722')] [2024-08-05 06:50:17,958][00139] DAMAGECOUNT value on done: 37230.0 [2024-08-05 06:50:17,958][00139] Sum rewards: -6.482, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-2.168', 'AMMO2': '0.007', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'weapon5': '0.032', 'AMMO4': '0.035', 'ARMOR': '0.050', 'WEAPON4': '0.050', 'weapon4': '0.100', 'WEAPON5': '0.150', 'HITCOUNT': '0.190', 'AMMO3': '0.196', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.950', 'weapon3': '1.242', 'weapon2': '1.676', 'FRAGCOUNT': '3.000'} [2024-08-05 06:50:18,208][00139] DAMAGECOUNT value on done: 37805.0 [2024-08-05 06:50:18,208][00139] Sum rewards: 0.454, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.000', 'AMMO4': '-0.026', 'AMMO2': '-0.005', 'AMMO5': '0.003', 'ARMOR': '0.008', 'weapon7': '0.096', 'AMMO3': '0.117', 'HITCOUNT': '0.180', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.876', 'weapon3': '1.450', 'weapon2': '1.516', 'FRAGCOUNT': '4.000'} [2024-08-05 06:50:19,309][00138] Updated weights for policy 0, policy_version 560 (0.0017) [2024-08-05 06:50:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4587520. Throughput: 0: 290.0. Samples: 1146853. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:20,484][00034] Avg episode reward: [(0, '-3.695')] [2024-08-05 06:50:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4587520. Throughput: 0: 291.0. Samples: 1148621. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:25,484][00034] Avg episode reward: [(0, '-3.695')] [2024-08-05 06:50:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4595712. Throughput: 0: 290.2. Samples: 1149478. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:30,485][00034] Avg episode reward: [(0, '-3.695')] [2024-08-05 06:50:32,415][00139] DAMAGECOUNT value on done: 37595.0 [2024-08-05 06:50:32,416][00139] Sum rewards: -1.678, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.694', 'AMMO4': '-0.007', 'AMMO2': '-0.001', 'ARMOR': '0.012', 'AMMO3': '0.147', 'HITCOUNT': '0.210', 'WEAPON3': '0.800', 'weapon2': '0.906', 'DAMAGECOUNT': '1.095', 'weapon3': '2.104', 'FRAGCOUNT': '3.000'} [2024-08-05 06:50:32,632][00139] DAMAGECOUNT value on done: 37840.0 [2024-08-05 06:50:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4603904. Throughput: 0: 290.2. Samples: 1151217. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:35,485][00034] Avg episode reward: [(0, '-3.720')] [2024-08-05 06:50:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4603904. Throughput: 0: 289.3. Samples: 1152927. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:40,484][00034] Avg episode reward: [(0, '-3.720')] [2024-08-05 06:50:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4612096. Throughput: 0: 288.7. Samples: 1153790. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:45,485][00034] Avg episode reward: [(0, '-3.720')] [2024-08-05 06:50:46,977][00139] DAMAGECOUNT value on done: 37665.0 [2024-08-05 06:50:47,204][00139] DAMAGECOUNT value on done: 38000.0 [2024-08-05 06:50:47,205][00139] Sum rewards: -2.416, reward structure: {'DEATHCOUNT': '-9.750', 'WEAPON1': '0.010', 'AMMO2': '0.015', 'AMMO4': '0.076', 'HITCOUNT': '0.120', 'AMMO3': '0.154', 'DAMAGECOUNT': '0.480', 'WEAPON3': '0.700', 'HEALTH': '0.802', 'weapon2': '1.312', 'weapon3': '1.664', 'FRAGCOUNT': '2.000'} [2024-08-05 06:50:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4620288. Throughput: 0: 289.8. Samples: 1155535. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:50,486][00034] Avg episode reward: [(0, '-3.728')] [2024-08-05 06:50:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4628480. Throughput: 0: 289.8. Samples: 1157277. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:50:55,484][00034] Avg episode reward: [(0, '-3.728')] [2024-08-05 06:51:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4628480. Throughput: 0: 289.3. Samples: 1158134. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:00,485][00034] Avg episode reward: [(0, '-3.728')] [2024-08-05 06:51:01,536][00139] DAMAGECOUNT value on done: 37725.0 [2024-08-05 06:51:01,812][00139] DAMAGECOUNT value on done: 38014.0 [2024-08-05 06:51:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4636672. Throughput: 0: 289.3. Samples: 1159873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:05,485][00034] Avg episode reward: [(0, '-3.852')] [2024-08-05 06:51:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4644864. Throughput: 0: 287.2. Samples: 1161547. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:10,484][00034] Avg episode reward: [(0, '-3.852')] [2024-08-05 06:51:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4644864. Throughput: 0: 287.7. Samples: 1162424. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:15,484][00034] Avg episode reward: [(0, '-3.852')] [2024-08-05 06:51:16,189][00139] DAMAGECOUNT value on done: 37750.0 [2024-08-05 06:51:16,413][00139] DAMAGECOUNT value on done: 38294.0 [2024-08-05 06:51:16,414][00139] Sum rewards: -2.083, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.035', 'AMMO5': '0.005', 'weapon5': '0.008', 'AMMO2': '0.017', 'WEAPON5': '0.050', 'ARMOR': '0.076', 'AMMO4': '0.085', 'AMMO3': '0.115', 'weapon4': '0.144', 'WEAPON4': '0.150', 'HITCOUNT': '0.260', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.840', 'FRAGCOUNT': '1.000', 'weapon2': '1.292', 'weapon3': '1.560'} [2024-08-05 06:51:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4653056. Throughput: 0: 288.0. Samples: 1164175. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:20,485][00034] Avg episode reward: [(0, '-3.821')] [2024-08-05 06:51:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4661248. Throughput: 0: 289.0. Samples: 1165930. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:25,485][00034] Avg episode reward: [(0, '-3.821')] [2024-08-05 06:51:30,081][00138] Updated weights for policy 0, policy_version 570 (0.0018) [2024-08-05 06:51:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4669440. Throughput: 0: 289.4. Samples: 1166815. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:30,484][00034] Avg episode reward: [(0, '-3.821')] [2024-08-05 06:51:30,666][00139] DAMAGECOUNT value on done: 37993.0 [2024-08-05 06:51:30,667][00139] Sum rewards: -4.990, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.308', 'weapon5': '0.002', 'AMMO2': '0.004', 'AMMO5': '0.007', 'AMMO4': '0.019', 'WEAPON5': '0.100', 'AMMO3': '0.233', 'HITCOUNT': '0.240', 'DAMAGECOUNT': '0.729', 'weapon2': '1.086', 'WEAPON3': '1.150', 'weapon3': '1.998', 'FRAGCOUNT': '3.000'} [2024-08-05 06:51:30,887][00139] DAMAGECOUNT value on done: 38384.0 [2024-08-05 06:51:30,888][00139] Sum rewards: -1.143, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.020', 'AMMO2': '0.027', 'HITCOUNT': '0.070', 'AMMO3': '0.081', 'AMMO4': '0.135', 'DAMAGECOUNT': '0.270', 'WEAPON4': '0.300', 'weapon4': '0.310', 'WEAPON3': '0.400', 'ARMOR': '0.530', 'FRAGCOUNT': '1.000', 'weapon2': '1.004', 'weapon3': '1.500'} [2024-08-05 06:51:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4669440. Throughput: 0: 288.2. Samples: 1168506. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:35,484][00034] Avg episode reward: [(0, '-3.804')] [2024-08-05 06:51:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_4669440.pth... [2024-08-05 06:51:35,571][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000536_4390912.pth [2024-08-05 06:51:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4677632. Throughput: 0: 287.5. Samples: 1170213. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:40,484][00034] Avg episode reward: [(0, '-3.804')] [2024-08-05 06:51:45,349][00139] DAMAGECOUNT value on done: 38103.0 [2024-08-05 06:51:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4685824. Throughput: 0: 288.0. 
Samples: 1171093. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:45,484][00034] Avg episode reward: [(0, '-3.748')] [2024-08-05 06:51:45,585][00139] DAMAGECOUNT value on done: 38464.0 [2024-08-05 06:51:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4685824. Throughput: 0: 287.4. Samples: 1172805. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:50,484][00034] Avg episode reward: [(0, '-3.774')] [2024-08-05 06:51:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4694016. Throughput: 0: 288.1. Samples: 1174513. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:51:55,485][00034] Avg episode reward: [(0, '-3.774')] [2024-08-05 06:52:00,090][00139] DAMAGECOUNT value on done: 38188.0 [2024-08-05 06:52:00,336][00139] DAMAGECOUNT value on done: 38528.0 [2024-08-05 06:52:00,337][00139] Sum rewards: -2.523, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO5': '0.007', 'AMMO2': '0.036', 'ARMOR': '0.048', 'HITCOUNT': '0.070', 'weapon5': '0.086', 'AMMO3': '0.143', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'weapon4': '0.158', 'AMMO4': '0.179', 'DAMAGECOUNT': '0.192', 'HEALTH': '0.308', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon3': '1.194', 'weapon2': '1.456'} [2024-08-05 06:52:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4702208. Throughput: 0: 288.0. Samples: 1175384. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:00,484][00034] Avg episode reward: [(0, '-3.802')] [2024-08-05 06:52:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4702208. Throughput: 0: 287.3. Samples: 1177102. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:05,484][00034] Avg episode reward: [(0, '-3.802')] [2024-08-05 06:52:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4710400. Throughput: 0: 285.4. Samples: 1178774. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:10,485][00034] Avg episode reward: [(0, '-3.802')] [2024-08-05 06:52:14,641][00139] DAMAGECOUNT value on done: 38283.0 [2024-08-05 06:52:14,642][00139] Sum rewards: -6.131, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.752', 'AMMO5': '0.019', 'AMMO2': '0.020', 'ARMOR': '0.032', 'weapon5': '0.050', 'weapon4': '0.072', 'AMMO4': '0.101', 'HITCOUNT': '0.120', 'AMMO3': '0.152', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.285', 'WEAPON5': '0.300', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.180', 'weapon3': '1.840'} [2024-08-05 06:52:14,869][00139] DAMAGECOUNT value on done: 38693.0 [2024-08-05 06:52:14,870][00139] Sum rewards: -8.518, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.306', 'FRAGCOUNT': '-0.500', 'WEAPON1': '0.020', 'AMMO5': '0.024', 'AMMO2': '0.025', 'weapon5': '0.040', 'AMMO3': '0.097', 'ARMOR': '0.116', 'HITCOUNT': '0.120', 'AMMO4': '0.125', 'weapon4': '0.282', 'WEAPON4': '0.350', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.600', 'weapon3': '1.114', 'weapon2': '1.730'} [2024-08-05 06:52:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4718592. Throughput: 0: 286.1. Samples: 1179690. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:15,484][00034] Avg episode reward: [(0, '-3.841')] [2024-08-05 06:52:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4726784. Throughput: 0: 287.6. Samples: 1181449. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:20,484][00034] Avg episode reward: [(0, '-3.841')] [2024-08-05 06:52:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4726784. Throughput: 0: 289.0. Samples: 1183220. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:25,484][00034] Avg episode reward: [(0, '-3.841')] [2024-08-05 06:52:29,096][00139] DAMAGECOUNT value on done: 38393.0 [2024-08-05 06:52:29,096][00139] Sum rewards: -6.304, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.844', 'AMMO2': '0.007', 'AMMO5': '0.014', 'weapon5': '0.020', 'AMMO4': '0.036', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.142', 'DAMAGECOUNT': '0.330', 'ARMOR': '0.466', 'WEAPON3': '0.800', 'weapon2': '1.300', 'weapon3': '1.724'} [2024-08-05 06:52:29,325][00139] DAMAGECOUNT value on done: 38907.0 [2024-08-05 06:52:29,326][00139] Sum rewards: -3.002, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.880', 'AMMO4': '-0.004', 'AMMO2': '-0.001', 'weapon5': '0.006', 'AMMO5': '0.007', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'weapon4': '0.052', 'HITCOUNT': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.191', 'DAMAGECOUNT': '0.642', 'WEAPON3': '0.750', 'weapon2': '1.428', 'weapon3': '1.666', 'FRAGCOUNT': '2.000'} [2024-08-05 06:52:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4734976. Throughput: 0: 288.3. Samples: 1184065. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:30,487][00034] Avg episode reward: [(0, '-3.930')] [2024-08-05 06:52:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4743168. Throughput: 0: 288.9. Samples: 1185806. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:35,484][00034] Avg episode reward: [(0, '-3.930')] [2024-08-05 06:52:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4743168. Throughput: 0: 289.3. Samples: 1187530. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:40,484][00034] Avg episode reward: [(0, '-3.930')] [2024-08-05 06:52:41,468][00138] Updated weights for policy 0, policy_version 580 (0.0018) [2024-08-05 06:52:43,791][00139] DAMAGECOUNT value on done: 38553.0 [2024-08-05 06:52:43,792][00139] Sum rewards: -3.460, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.098', 'AMMO5': '0.005', 'weapon5': '0.008', 'WEAPON1': '0.020', 'ARMOR': '0.024', 'AMMO2': '0.039', 'weapon4': '0.050', 'AMMO3': '0.070', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO4': '0.195', 'WEAPON4': '0.350', 'DAMAGECOUNT': '0.480', 'WEAPON3': '0.550', 'weapon3': '1.362', 'weapon2': '1.524', 'FRAGCOUNT': '2.000'} [2024-08-05 06:52:44,018][00139] DAMAGECOUNT value on done: 39062.0 [2024-08-05 06:52:44,019][00139] Sum rewards: -3.684, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.910', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.034', 'weapon7': '0.034', 'AMMO3': '0.079', 'WEAPON5': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'HITCOUNT': '0.150', 'AMMO4': '0.168', 'WEAPON4': '0.250', 'WEAPON3': '0.400', 'ARMOR': '0.440', 'DAMAGECOUNT': '0.465', 'weapon3': '0.476', 'weapon4': '0.640', 'FRAGCOUNT': '1.000', 'weapon2': '1.672'} [2024-08-05 06:52:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4751360. Throughput: 0: 288.0. Samples: 1188346. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:45,485][00034] Avg episode reward: [(0, '-3.914')] [2024-08-05 06:52:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4759552. Throughput: 0: 288.6. Samples: 1190090. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:50,484][00034] Avg episode reward: [(0, '-3.914')] [2024-08-05 06:52:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4767744. Throughput: 0: 291.5. Samples: 1191892. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:52:55,484][00034] Avg episode reward: [(0, '-3.914')] [2024-08-05 06:52:57,980][00139] DAMAGECOUNT value on done: 38563.0 [2024-08-05 06:52:58,189][00139] DAMAGECOUNT value on done: 39164.0 [2024-08-05 06:52:58,189][00139] Sum rewards: -7.710, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.932', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.005', 'weapon5': '0.006', 'AMMO2': '0.017', 'WEAPON5': '0.050', 'weapon4': '0.058', 'ARMOR': '0.084', 'weapon7': '0.084', 'AMMO4': '0.085', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.139', 'DAMAGECOUNT': '0.306', 'WEAPON3': '0.750', 'weapon3': '1.340', 'weapon2': '1.538'} [2024-08-05 06:53:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4767744. Throughput: 0: 290.9. Samples: 1192780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:00,484][00034] Avg episode reward: [(0, '-4.033')] [2024-08-05 06:53:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4775936. Throughput: 0: 290.8. Samples: 1194537. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:05,485][00034] Avg episode reward: [(0, '-4.033')] [2024-08-05 06:53:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4784128. Throughput: 0: 290.5. Samples: 1196293. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:10,484][00034] Avg episode reward: [(0, '-4.033')] [2024-08-05 06:53:12,507][00139] DAMAGECOUNT value on done: 38613.0 [2024-08-05 06:53:12,755][00139] DAMAGECOUNT value on done: 39304.0 [2024-08-05 06:53:12,755][00139] Sum rewards: -1.853, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.216', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'AMMO2': '0.041', 'ARMOR': '0.052', 'AMMO3': '0.083', 'HITCOUNT': '0.090', 'WEAPON5': '0.100', 'AMMO4': '0.206', 'weapon4': '0.264', 'WEAPON4': '0.400', 'DAMAGECOUNT': '0.420', 'WEAPON3': '0.450', 'FRAGCOUNT': '1.000', 'weapon3': '1.302', 'weapon2': '1.430'} [2024-08-05 06:53:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4784128. Throughput: 0: 289.6. Samples: 1197096. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:15,484][00034] Avg episode reward: [(0, '-4.031')] [2024-08-05 06:53:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4792320. Throughput: 0: 289.0. Samples: 1198810. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:20,484][00034] Avg episode reward: [(0, '-4.031')] [2024-08-05 06:53:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4800512. Throughput: 0: 289.8. Samples: 1200569. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:25,485][00034] Avg episode reward: [(0, '-4.031')] [2024-08-05 06:53:27,066][00139] DAMAGECOUNT value on done: 38718.0 [2024-08-05 06:53:27,067][00139] Sum rewards: -7.374, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.526', 'FRAGCOUNT': '-1.500', 'AMMO2': '0.011', 'ARMOR': '0.032', 'AMMO5': '0.037', 'weapon4': '0.044', 'WEAPON4': '0.050', 'AMMO4': '0.052', 'weapon5': '0.080', 'HITCOUNT': '0.100', 'AMMO3': '0.156', 'DAMAGECOUNT': '0.315', 'WEAPON5': '0.600', 'WEAPON3': '0.750', 'weapon2': '1.260', 'weapon3': '1.914'} [2024-08-05 06:53:27,302][00139] DAMAGECOUNT value on done: 39534.0 [2024-08-05 06:53:27,302][00139] Sum rewards: -9.862, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-3.708', 'AMMO5': '0.005', 'AMMO2': '0.019', 'ARMOR': '0.020', 'weapon4': '0.036', 'WEAPON1': '0.040', 'AMMO4': '0.093', 'WEAPON5': '0.100', 'HITCOUNT': '0.230', 'AMMO3': '0.245', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.690', 'FRAGCOUNT': '1.000', 'weapon2': '1.214', 'WEAPON3': '1.450', 'weapon3': '1.954'} [2024-08-05 06:53:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4800512. Throughput: 0: 291.2. Samples: 1201450. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:30,485][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:53:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4808704. Throughput: 0: 291.2. Samples: 1203193. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:35,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:53:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000587_4808704.pth... [2024-08-05 06:53:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000553_4530176.pth [2024-08-05 06:53:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). 
Total num frames: 4816896. Throughput: 0: 289.2. Samples: 1204904. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:40,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 06:53:41,623][00139] DAMAGECOUNT value on done: 38858.0 [2024-08-05 06:53:41,624][00139] Sum rewards: -2.338, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.718', 'AMMO5': '0.005', 'weapon5': '0.016', 'AMMO2': '0.034', 'weapon7': '0.076', 'weapon4': '0.082', 'ARMOR': '0.092', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.124', 'WEAPON4': '0.150', 'AMMO4': '0.169', 'DAMAGECOUNT': '0.420', 'WEAPON3': '0.550', 'weapon3': '1.158', 'weapon2': '1.994', 'FRAGCOUNT': '2.000'} [2024-08-05 06:53:41,856][00139] DAMAGECOUNT value on done: 39736.0 [2024-08-05 06:53:41,856][00139] Sum rewards: -1.329, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.066', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.024', 'weapon5': '0.032', 'ARMOR': '0.052', 'weapon4': '0.074', 'AMMO3': '0.109', 'AMMO4': '0.121', 'WEAPON4': '0.150', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.606', 'weapon2': '1.008', 'weapon3': '1.820'} [2024-08-05 06:53:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4825088. Throughput: 0: 288.5. Samples: 1205764. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:45,484][00034] Avg episode reward: [(0, '-4.120')] [2024-08-05 06:53:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4825088. Throughput: 0: 286.4. Samples: 1207425. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:50,484][00034] Avg episode reward: [(0, '-4.120')] [2024-08-05 06:53:52,244][00138] Updated weights for policy 0, policy_version 590 (0.0017) [2024-08-05 06:53:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4833280. 
Throughput: 0: 286.3. Samples: 1209176. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:53:55,484][00034] Avg episode reward: [(0, '-4.120')] [2024-08-05 06:53:56,381][00139] DAMAGECOUNT value on done: 38898.0 [2024-08-05 06:53:56,619][00139] DAMAGECOUNT value on done: 39956.0 [2024-08-05 06:53:56,620][00139] Sum rewards: -13.241, reward structure: {'DEATHCOUNT': '-17.250', 'HEALTH': '-1.284', 'FRAGCOUNT': '-1.000', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'AMMO2': '0.030', 'ARMOR': '0.036', 'weapon5': '0.058', 'weapon4': '0.130', 'AMMO4': '0.149', 'WEAPON4': '0.200', 'HITCOUNT': '0.200', 'AMMO3': '0.233', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.660', 'WEAPON3': '1.150', 'weapon3': '1.578', 'weapon2': '1.594'} [2024-08-05 06:54:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4841472. Throughput: 0: 287.6. Samples: 1210037. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:00,485][00034] Avg episode reward: [(0, '-4.279')] [2024-08-05 06:54:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4841472. Throughput: 0: 287.7. Samples: 1211755. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:05,485][00034] Avg episode reward: [(0, '-4.279')] [2024-08-05 06:54:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4849664. Throughput: 0: 288.0. Samples: 1213531. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:10,485][00034] Avg episode reward: [(0, '-4.279')] [2024-08-05 06:54:10,838][00139] DAMAGECOUNT value on done: 38966.0 [2024-08-05 06:54:11,051][00139] DAMAGECOUNT value on done: 40126.0 [2024-08-05 06:54:11,052][00139] Sum rewards: -2.487, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.256', 'AMMO5': '0.003', 'AMMO2': '0.012', 'WEAPON1': '0.020', 'weapon5': '0.058', 'AMMO4': '0.059', 'WEAPON5': '0.100', 'ARMOR': '0.104', 'HITCOUNT': '0.120', 'AMMO3': '0.140', 'WEAPON4': '0.300', 'weapon4': '0.306', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.388', 'weapon2': '1.400'} [2024-08-05 06:54:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4857856. Throughput: 0: 286.8. Samples: 1214357. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:15,485][00034] Avg episode reward: [(0, '-4.374')] [2024-08-05 06:54:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4857856. Throughput: 0: 285.4. Samples: 1216034. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:20,484][00034] Avg episode reward: [(0, '-4.374')] [2024-08-05 06:54:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4866048. Throughput: 0: 285.5. Samples: 1217751. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:25,484][00034] Avg episode reward: [(0, '-4.374')] [2024-08-05 06:54:25,766][00139] DAMAGECOUNT value on done: 39061.0 [2024-08-05 06:54:25,766][00139] Sum rewards: -3.609, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.490', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'AMMO5': '0.005', 'ARMOR': '0.020', 'AMMO3': '0.072', 'HITCOUNT': '0.080', 'WEAPON4': '0.200', 'weapon4': '0.278', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.500', 'weapon3': '0.880', 'FRAGCOUNT': '2.000', 'weapon2': '2.066'} [2024-08-05 06:54:26,000][00139] DAMAGECOUNT value on done: 40366.0 [2024-08-05 06:54:26,000][00139] Sum rewards: -0.645, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-2.180', 'AMMO2': '0.010', 'weapon4': '0.022', 'AMMO4': '0.050', 'AMMO3': '0.108', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.720', 'weapon3': '1.620', 'weapon2': '1.724', 'FRAGCOUNT': '3.000'} [2024-08-05 06:54:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4874240. Throughput: 0: 285.8. Samples: 1218625. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:30,485][00034] Avg episode reward: [(0, '-4.314')] [2024-08-05 06:54:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4882432. Throughput: 0: 289.2. Samples: 1220437. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:35,485][00034] Avg episode reward: [(0, '-4.314')] [2024-08-05 06:54:40,073][00139] DAMAGECOUNT value on done: 39123.0 [2024-08-05 06:54:40,289][00139] DAMAGECOUNT value on done: 40491.0 [2024-08-05 06:54:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4882432. Throughput: 0: 288.3. Samples: 1222149. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:40,484][00034] Avg episode reward: [(0, '-4.345')] [2024-08-05 06:54:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4890624. Throughput: 0: 288.3. Samples: 1223010. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:45,484][00034] Avg episode reward: [(0, '-4.345')] [2024-08-05 06:54:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4898816. Throughput: 0: 288.0. Samples: 1224714. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:50,484][00034] Avg episode reward: [(0, '-4.345')] [2024-08-05 06:54:54,646][00139] DAMAGECOUNT value on done: 39254.0 [2024-08-05 06:54:54,646][00139] Sum rewards: -2.688, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.081', 'AMMO5': '0.010', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'ARMOR': '0.048', 'weapon5': '0.058', 'AMMO3': '0.065', 'AMMO4': '0.086', 'HITCOUNT': '0.120', 'weapon4': '0.122', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.393', 'FRAGCOUNT': '1.000', 'weapon3': '1.178', 'weapon2': '1.976'} [2024-08-05 06:54:54,875][00139] DAMAGECOUNT value on done: 40626.0 [2024-08-05 06:54:54,875][00139] Sum rewards: -4.519, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.221', 'weapon5': '0.008', 'AMMO5': '0.012', 'AMMO2': '0.024', 'ARMOR': '0.064', 'weapon4': '0.076', 'HITCOUNT': '0.100', 'AMMO4': '0.119', 'AMMO3': '0.120', 'WEAPON5': '0.150', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.405', 'WEAPON3': '0.450', 'weapon2': '1.400', 'weapon3': '1.774', 'FRAGCOUNT': '2.000'} [2024-08-05 06:54:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4898816. Throughput: 0: 287.6. Samples: 1226471. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:54:55,484][00034] Avg episode reward: [(0, '-4.340')] [2024-08-05 06:55:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4907008. Throughput: 0: 288.5. Samples: 1227341. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:00,484][00034] Avg episode reward: [(0, '-4.340')] [2024-08-05 06:55:03,172][00138] Updated weights for policy 0, policy_version 600 (0.0017) [2024-08-05 06:55:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4915200. Throughput: 0: 290.2. Samples: 1229091. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:05,484][00034] Avg episode reward: [(0, '-4.340')] [2024-08-05 06:55:08,998][00139] DAMAGECOUNT value on done: 39609.0 [2024-08-05 06:55:08,999][00139] Sum rewards: -4.596, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.173', 'AMMO5': '0.009', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'weapon5': '0.070', 'AMMO4': '0.082', 'AMMO3': '0.161', 'WEAPON5': '0.200', 'weapon4': '0.232', 'HITCOUNT': '0.240', 'WEAPON4': '0.250', 'ARMOR': '0.473', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.065', 'weapon3': '1.100', 'weapon2': '1.808'} [2024-08-05 06:55:09,100][00139] Large shaping reward -2.534 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.28500000000000003, -95.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 06:55:09,221][00139] DAMAGECOUNT value on done: 40801.0 [2024-08-05 06:55:09,222][00139] Sum rewards: -4.749, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.654', 'AMMO2': '0.000', 'AMMO4': '0.000', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'ARMOR': '0.028', 'weapon5': '0.060', 'WEAPON4': '0.100', 'AMMO3': '0.108', 'weapon4': '0.114', 'HITCOUNT': '0.130', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.525', 'WEAPON3': '0.700', 'weapon2': '1.178', 'FRAGCOUNT': '1.500', 'weapon3': '1.874'} [2024-08-05 06:55:10,483][00034] 
Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4923392. Throughput: 0: 291.2. Samples: 1230857. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:10,484][00034] Avg episode reward: [(0, '-4.400')] [2024-08-05 06:55:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4923392. Throughput: 0: 291.1. Samples: 1231725. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:15,484][00034] Avg episode reward: [(0, '-4.400')] [2024-08-05 06:55:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4931584. Throughput: 0: 287.7. Samples: 1233383. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:20,484][00034] Avg episode reward: [(0, '-4.400')] [2024-08-05 06:55:23,941][00139] DAMAGECOUNT value on done: 39708.0 [2024-08-05 06:55:23,942][00139] Sum rewards: 0.342, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.005', 'AMMO5': '0.003', 'AMMO2': '0.014', 'ARMOR': '0.060', 'weapon5': '0.060', 'AMMO4': '0.071', 'AMMO3': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'WEAPON4': '0.200', 'weapon4': '0.202', 'DAMAGECOUNT': '0.297', 'WEAPON3': '0.600', 'weapon2': '1.268', 'weapon3': '1.512', 'FRAGCOUNT': '2.000'} [2024-08-05 06:55:24,174][00139] DAMAGECOUNT value on done: 40806.0 [2024-08-05 06:55:24,175][00139] Sum rewards: -5.091, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-2.078', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.009', 'HITCOUNT': '0.010', 'AMMO2': '0.013', 'weapon5': '0.014', 'DAMAGECOUNT': '0.015', 'WEAPON1': '0.020', 'AMMO4': '0.067', 'AMMO3': '0.106', 'WEAPON5': '0.200', 'WEAPON4': '0.300', 'weapon4': '0.310', 'ARMOR': '0.576', 'WEAPON3': '0.700', 'weapon2': '1.068', 'weapon3': '1.078'} [2024-08-05 06:55:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4939776. Throughput: 0: 286.6. Samples: 1235045. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:25,485][00034] Avg episode reward: [(0, '-4.371')] [2024-08-05 06:55:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4939776. Throughput: 0: 287.0. Samples: 1235925. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:30,485][00034] Avg episode reward: [(0, '-4.371')] [2024-08-05 06:55:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4947968. Throughput: 0: 287.2. Samples: 1237640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:35,484][00034] Avg episode reward: [(0, '-4.371')] [2024-08-05 06:55:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000604_4947968.pth... [2024-08-05 06:55:35,585][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000570_4669440.pth [2024-08-05 06:55:38,713][00139] DAMAGECOUNT value on done: 39953.0 [2024-08-05 06:55:38,713][00139] Sum rewards: -5.054, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.444', 'weapon5': '0.006', 'AMMO5': '0.007', 'AMMO2': '0.015', 'ARMOR': '0.024', 'AMMO4': '0.074', 'weapon4': '0.144', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.179', 'HITCOUNT': '0.190', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.900', 'weapon2': '1.362', 'weapon3': '1.704', 'FRAGCOUNT': '2.000'} [2024-08-05 06:55:38,966][00139] DAMAGECOUNT value on done: 40956.0 [2024-08-05 06:55:38,966][00139] Sum rewards: -6.123, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.550', 'AMMO5': '0.003', 'AMMO2': '0.014', 'WEAPON5': '0.050', 'AMMO4': '0.068', 'WEAPON4': '0.100', 'ARMOR': '0.123', 'HITCOUNT': '0.130', 'AMMO3': '0.201', 'DAMAGECOUNT': '0.450', 'WEAPON3': '1.050', 'weapon2': '1.276', 'weapon3': '1.962', 'FRAGCOUNT': '2.000'} [2024-08-05 06:55:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4956160. Throughput: 0: 285.6. 
Samples: 1239323. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:40,484][00034] Avg episode reward: [(0, '-4.305')] [2024-08-05 06:55:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4956160. Throughput: 0: 286.2. Samples: 1240222. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:45,485][00034] Avg episode reward: [(0, '-4.305')] [2024-08-05 06:55:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4964352. Throughput: 0: 286.8. Samples: 1241997. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:50,485][00034] Avg episode reward: [(0, '-4.305')] [2024-08-05 06:55:53,168][00139] DAMAGECOUNT value on done: 39963.0 [2024-08-05 06:55:53,456][00139] DAMAGECOUNT value on done: 41182.0 [2024-08-05 06:55:53,457][00139] Sum rewards: -3.060, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.350', 'AMMO5': '0.010', 'AMMO2': '0.018', 'ARMOR': '0.035', 'weapon4': '0.044', 'HITCOUNT': '0.070', 'AMMO4': '0.089', 'weapon5': '0.118', 'AMMO3': '0.132', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.678', 'weapon3': '1.492', 'weapon2': '1.654', 'FRAGCOUNT': '2.000'} [2024-08-05 06:55:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4972544. Throughput: 0: 284.9. Samples: 1243676. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:55:55,484][00034] Avg episode reward: [(0, '-4.356')] [2024-08-05 06:56:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 4972544. Throughput: 0: 284.7. Samples: 1244538. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:00,484][00034] Avg episode reward: [(0, '-4.356')] [2024-08-05 06:56:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4980736. Throughput: 0: 286.0. Samples: 1246252. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:05,484][00034] Avg episode reward: [(0, '-4.356')] [2024-08-05 06:56:07,858][00139] DAMAGECOUNT value on done: 40083.0 [2024-08-05 06:56:07,858][00139] Sum rewards: -6.679, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.981', 'AMMO2': '0.028', 'HITCOUNT': '0.080', 'ARMOR': '0.108', 'AMMO4': '0.139', 'weapon4': '0.152', 'AMMO3': '0.163', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.360', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.094', 'weapon3': '1.528'} [2024-08-05 06:56:08,070][00139] DAMAGECOUNT value on done: 41498.0 [2024-08-05 06:56:08,071][00139] Sum rewards: -4.277, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.205', 'AMMO5': '0.007', 'AMMO2': '0.008', 'WEAPON1': '0.020', 'AMMO4': '0.037', 'ARMOR': '0.048', 'weapon5': '0.070', 'WEAPON4': '0.100', 'AMMO3': '0.137', 'HITCOUNT': '0.160', 'WEAPON5': '0.200', 'weapon4': '0.202', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.948', 'weapon2': '1.464', 'weapon3': '1.476', 'FRAGCOUNT': '2.000'} [2024-08-05 06:56:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 4988928. Throughput: 0: 287.6. Samples: 1247985. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:10,485][00034] Avg episode reward: [(0, '-4.402')] [2024-08-05 06:56:14,521][00138] Updated weights for policy 0, policy_version 610 (0.0017) [2024-08-05 06:56:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 4997120. Throughput: 0: 288.0. Samples: 1248887. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:15,484][00034] Avg episode reward: [(0, '-4.402')] [2024-08-05 06:56:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 4997120. Throughput: 0: 288.7. Samples: 1250633. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:20,484][00034] Avg episode reward: [(0, '-4.402')] [2024-08-05 06:56:22,382][00139] DAMAGECOUNT value on done: 40318.0 [2024-08-05 06:56:22,382][00139] Sum rewards: -1.529, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.238', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'AMMO5': '0.013', 'weapon5': '0.042', 'HITCOUNT': '0.130', 'AMMO3': '0.131', 'WEAPON5': '0.150', 'FRAGCOUNT': '0.500', 'ARMOR': '0.504', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.705', 'weapon2': '1.006', 'weapon3': '1.836'} [2024-08-05 06:56:22,628][00139] DAMAGECOUNT value on done: 41557.0 [2024-08-05 06:56:22,629][00139] Sum rewards: -3.937, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.917', 'AMMO4': '-0.031', 'AMMO2': '-0.006', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'HITCOUNT': '0.050', 'ARMOR': '0.089', 'WEAPON5': '0.100', 'AMMO3': '0.134', 'DAMAGECOUNT': '0.177', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.518', 'weapon2': '1.684'} [2024-08-05 06:56:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5005312. Throughput: 0: 289.2. Samples: 1252335. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:25,485][00034] Avg episode reward: [(0, '-4.425')] [2024-08-05 06:56:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5013504. Throughput: 0: 289.0. Samples: 1253225. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:30,484][00034] Avg episode reward: [(0, '-4.425')] [2024-08-05 06:56:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5013504. Throughput: 0: 287.0. Samples: 1254910. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:35,485][00034] Avg episode reward: [(0, '-4.425')] [2024-08-05 06:56:37,043][00139] DAMAGECOUNT value on done: 40373.0 [2024-08-05 06:56:37,267][00139] DAMAGECOUNT value on done: 41719.0 [2024-08-05 06:56:37,268][00139] Sum rewards: -12.247, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-4.190', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.005', 'WEAPON1': '0.010', 'AMMO4': '0.023', 'AMMO5': '0.032', 'weapon5': '0.110', 'HITCOUNT': '0.120', 'AMMO3': '0.167', 'WEAPON4': '0.250', 'weapon4': '0.484', 'DAMAGECOUNT': '0.486', 'WEAPON5': '0.600', 'WEAPON3': '0.900', 'weapon3': '1.348', 'weapon2': '1.408'} [2024-08-05 06:56:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5021696. Throughput: 0: 287.7. Samples: 1256624. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:40,484][00034] Avg episode reward: [(0, '-4.598')] [2024-08-05 06:56:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5029888. Throughput: 0: 287.7. Samples: 1257483. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:45,484][00034] Avg episode reward: [(0, '-4.598')] [2024-08-05 06:56:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5038080. Throughput: 0: 288.1. Samples: 1259215. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:50,484][00034] Avg episode reward: [(0, '-4.598')] [2024-08-05 06:56:51,555][00139] DAMAGECOUNT value on done: 40714.0 [2024-08-05 06:56:51,556][00139] Sum rewards: -3.824, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.610', 'AMMO2': '0.004', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'weapon5': '0.014', 'AMMO4': '0.021', 'weapon4': '0.084', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.179', 'HITCOUNT': '0.310', 'ARMOR': '0.515', 'DAMAGECOUNT': '1.023', 'WEAPON3': '1.100', 'weapon2': '1.162', 'FRAGCOUNT': '1.500', 'weapon3': '2.106'} [2024-08-05 06:56:51,781][00139] DAMAGECOUNT value on done: 41972.0 [2024-08-05 06:56:51,781][00139] Sum rewards: -4.718, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.020', 'weapon5': '0.004', 'AMMO5': '0.007', 'AMMO2': '0.022', 'WEAPON5': '0.050', 'ARMOR': '0.056', 'AMMO4': '0.110', 'AMMO3': '0.115', 'weapon7': '0.144', 'HITCOUNT': '0.150', 'AMMO6': '0.220', 'AMMO7': '0.220', 'WEAPON7': '0.300', 'WEAPON4': '0.350', 'FRAGCOUNT': '0.500', 'weapon4': '0.630', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.759', 'weapon3': '0.830', 'weapon2': '1.134'} [2024-08-05 06:56:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5038080. Throughput: 0: 287.7. Samples: 1260931. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:56:55,485][00034] Avg episode reward: [(0, '-4.570')] [2024-08-05 06:57:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5046272. Throughput: 0: 287.1. Samples: 1261807. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:00,484][00034] Avg episode reward: [(0, '-4.570')] [2024-08-05 06:57:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5054464. Throughput: 0: 287.2. Samples: 1263557. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:05,484][00034] Avg episode reward: [(0, '-4.570')] [2024-08-05 06:57:06,141][00139] DAMAGECOUNT value on done: 40917.0 [2024-08-05 06:57:06,141][00139] Sum rewards: -4.050, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.108', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'AMMO2': '0.019', 'weapon5': '0.086', 'AMMO4': '0.095', 'ARMOR': '0.096', 'WEAPON5': '0.100', 'AMMO3': '0.118', 'WEAPON4': '0.150', 'HITCOUNT': '0.170', 'weapon4': '0.232', 'DAMAGECOUNT': '0.609', 'WEAPON3': '0.650', 'weapon3': '1.558', 'weapon2': '1.660'} [2024-08-05 06:57:06,380][00139] DAMAGECOUNT value on done: 42262.0 [2024-08-05 06:57:06,381][00139] Sum rewards: -4.817, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.602', 'AMMO5': '0.013', 'AMMO2': '0.034', 'weapon5': '0.038', 'ARMOR': '0.060', 'WEAPON5': '0.150', 'AMMO3': '0.157', 'AMMO4': '0.168', 'HITCOUNT': '0.220', 'WEAPON4': '0.400', 'weapon4': '0.442', 'WEAPON3': '0.850', 'DAMAGECOUNT': '0.870', 'FRAGCOUNT': '1.000', 'weapon3': '1.394', 'weapon2': '1.490'} [2024-08-05 06:57:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5054464. Throughput: 0: 287.4. Samples: 1265269. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:10,485][00034] Avg episode reward: [(0, '-4.560')] [2024-08-05 06:57:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5062656. Throughput: 0: 287.0. Samples: 1266142. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:15,485][00034] Avg episode reward: [(0, '-4.560')] [2024-08-05 06:57:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5070848. Throughput: 0: 287.7. Samples: 1267856. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:20,484][00034] Avg episode reward: [(0, '-4.560')] [2024-08-05 06:57:20,706][00139] DAMAGECOUNT value on done: 41052.0 [2024-08-05 06:57:20,706][00139] Sum rewards: -2.382, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.142', 'AMMO5': '0.003', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'ARMOR': '0.040', 'weapon5': '0.052', 'AMMO4': '0.087', 'AMMO3': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.140', 'weapon4': '0.162', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.405', 'WEAPON3': '0.700', 'weapon2': '0.876', 'weapon3': '1.358', 'FRAGCOUNT': '2.000'} [2024-08-05 06:57:20,951][00139] DAMAGECOUNT value on done: 42552.0 [2024-08-05 06:57:20,952][00139] Sum rewards: -5.361, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.460', 'AMMO2': '0.002', 'AMMO4': '0.011', 'AMMO5': '0.012', 'weapon5': '0.014', 'weapon4': '0.032', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'AMMO3': '0.181', 'HITCOUNT': '0.230', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.870', 'WEAPON3': '1.050', 'weapon2': '1.080', 'FRAGCOUNT': '2.000', 'weapon3': '2.276'} [2024-08-05 06:57:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5070848. Throughput: 0: 288.0. Samples: 1269584. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:25,486][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:57:25,919][00138] Updated weights for policy 0, policy_version 620 (0.0019) [2024-08-05 06:57:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5079040. Throughput: 0: 285.6. Samples: 1270334. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:30,484][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:57:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5087232. Throughput: 0: 286.2. Samples: 1272096. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:35,484][00034] Avg episode reward: [(0, '-4.623')] [2024-08-05 06:57:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000621_5087232.pth... [2024-08-05 06:57:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000587_4808704.pth [2024-08-05 06:57:35,612][00139] DAMAGECOUNT value on done: 41597.0 [2024-08-05 06:57:35,612][00139] Sum rewards: 0.411, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.420', 'AMMO2': '0.011', 'AMMO5': '0.015', 'weapon5': '0.022', 'WEAPON4': '0.050', 'AMMO4': '0.052', 'AMMO3': '0.136', 'WEAPON5': '0.200', 'weapon4': '0.238', 'HITCOUNT': '0.360', 'WEAPON3': '1.000', 'weapon2': '1.110', 'DAMAGECOUNT': '1.635', 'weapon3': '2.002', 'FRAGCOUNT': '5.000'} [2024-08-05 06:57:35,844][00139] DAMAGECOUNT value on done: 42899.0 [2024-08-05 06:57:35,844][00139] Sum rewards: 0.452, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.236', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'ARMOR': '0.033', 'AMMO2': '0.036', 'weapon5': '0.054', 'AMMO3': '0.094', 'WEAPON5': '0.150', 'AMMO4': '0.179', 'HITCOUNT': '0.240', 'weapon4': '0.448', 'WEAPON4': '0.450', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.041', 'weapon2': '1.318', 'weapon3': '1.378', 'FRAGCOUNT': '3.000'} [2024-08-05 06:57:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5095424. Throughput: 0: 286.6. Samples: 1273830. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:40,484][00034] Avg episode reward: [(0, '-4.514')] [2024-08-05 06:57:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5095424. Throughput: 0: 286.2. Samples: 1274684. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:45,484][00034] Avg episode reward: [(0, '-4.514')] [2024-08-05 06:57:50,292][00139] DAMAGECOUNT value on done: 41943.0 [2024-08-05 06:57:50,293][00139] Sum rewards: -7.745, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.910', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'weapon5': '0.034', 'AMMO2': '0.037', 'ARMOR': '0.037', 'WEAPON5': '0.050', 'AMMO4': '0.184', 'HITCOUNT': '0.200', 'AMMO3': '0.202', 'WEAPON4': '0.400', 'weapon4': '0.496', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '1.038', 'WEAPON3': '1.100', 'weapon2': '1.142', 'weapon3': '1.720'} [2024-08-05 06:57:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5103616. Throughput: 0: 285.3. Samples: 1276396. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:50,484][00034] Avg episode reward: [(0, '-4.550')] [2024-08-05 06:57:50,536][00139] DAMAGECOUNT value on done: 43149.0 [2024-08-05 06:57:50,536][00139] Sum rewards: 2.325, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.420', 'AMMO5': '0.005', 'AMMO2': '0.011', 'weapon5': '0.014', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'AMMO4': '0.055', 'AMMO3': '0.094', 'WEAPON5': '0.100', 'weapon4': '0.128', 'HITCOUNT': '0.250', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.750', 'weapon2': '1.296', 'weapon3': '1.722', 'FRAGCOUNT': '3.000'} [2024-08-05 06:57:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5111808. Throughput: 0: 285.7. Samples: 1278126. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:57:55,484][00034] Avg episode reward: [(0, '-4.470')] [2024-08-05 06:58:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5111808. Throughput: 0: 284.8. Samples: 1278959. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:00,486][00034] Avg episode reward: [(0, '-4.470')] [2024-08-05 06:58:04,969][00139] DAMAGECOUNT value on done: 42320.0 [2024-08-05 06:58:04,970][00139] Sum rewards: -3.078, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.248', 'weapon5': '0.016', 'AMMO5': '0.017', 'AMMO2': '0.018', 'ARMOR': '0.024', 'weapon4': '0.070', 'AMMO4': '0.091', 'WEAPON4': '0.100', 'AMMO3': '0.124', 'HITCOUNT': '0.240', 'WEAPON5': '0.350', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.131', 'weapon2': '1.522', 'weapon3': '1.716', 'FRAGCOUNT': '3.000'} [2024-08-05 06:58:05,194][00139] DAMAGECOUNT value on done: 43242.0 [2024-08-05 06:58:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5120000. Throughput: 0: 285.4. Samples: 1280698. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:05,484][00034] Avg episode reward: [(0, '-4.451')] [2024-08-05 06:58:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5128192. Throughput: 0: 284.7. Samples: 1282396. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:10,485][00034] Avg episode reward: [(0, '-4.451')] [2024-08-05 06:58:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5128192. Throughput: 0: 288.2. Samples: 1283304. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:15,485][00034] Avg episode reward: [(0, '-4.451')] [2024-08-05 06:58:19,583][00139] DAMAGECOUNT value on done: 42440.0 [2024-08-05 06:58:19,584][00139] Sum rewards: -5.225, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.414', 'AMMO2': '0.004', 'AMMO5': '0.010', 'AMMO4': '0.019', 'HITCOUNT': '0.110', 'weapon5': '0.142', 'AMMO3': '0.162', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.360', 'ARMOR': '0.436', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.454', 'weapon3': '1.792'} [2024-08-05 06:58:19,813][00139] DAMAGECOUNT value on done: 43388.0 [2024-08-05 06:58:19,814][00139] Sum rewards: -5.095, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.372', 'AMMO5': '0.007', 'AMMO2': '0.011', 'WEAPON4': '0.050', 'AMMO4': '0.056', 'weapon5': '0.060', 'weapon4': '0.066', 'WEAPON5': '0.100', 'AMMO3': '0.120', 'HITCOUNT': '0.130', 'DAMAGECOUNT': '0.438', 'WEAPON3': '0.600', 'weapon2': '1.478', 'weapon3': '1.910'} [2024-08-05 06:58:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5136384. Throughput: 0: 286.6. Samples: 1284994. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:20,484][00034] Avg episode reward: [(0, '-4.471')] [2024-08-05 06:58:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5144576. Throughput: 0: 287.3. Samples: 1286757. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:25,484][00034] Avg episode reward: [(0, '-4.471')] [2024-08-05 06:58:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5152768. Throughput: 0: 287.9. Samples: 1287638. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:30,484][00034] Avg episode reward: [(0, '-4.471')] [2024-08-05 06:58:34,009][00139] DAMAGECOUNT value on done: 42524.0 [2024-08-05 06:58:34,010][00139] Sum rewards: -6.998, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.868', 'AMMO5': '0.003', 'AMMO2': '0.013', 'WEAPON1': '0.020', 'ARMOR': '0.024', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'AMMO4': '0.065', 'HITCOUNT': '0.080', 'weapon4': '0.092', 'AMMO3': '0.153', 'DAMAGECOUNT': '0.252', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.674', 'weapon3': '1.744'} [2024-08-05 06:58:34,241][00139] DAMAGECOUNT value on done: 43628.0 [2024-08-05 06:58:34,241][00139] Sum rewards: -3.317, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.116', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.022', 'weapon4': '0.030', 'WEAPON5': '0.100', 'AMMO4': '0.109', 'AMMO3': '0.112', 'HITCOUNT': '0.190', 'WEAPON4': '0.200', 'ARMOR': '0.472', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.720', 'FRAGCOUNT': '1.000', 'weapon2': '1.712', 'weapon3': '1.766'} [2024-08-05 06:58:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5152768. Throughput: 0: 288.2. Samples: 1289367. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:35,485][00034] Avg episode reward: [(0, '-4.421')] [2024-08-05 06:58:37,261][00138] Updated weights for policy 0, policy_version 630 (0.0017) [2024-08-05 06:58:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5160960. Throughput: 0: 288.1. Samples: 1291090. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:40,484][00034] Avg episode reward: [(0, '-4.421')] [2024-08-05 06:58:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5169152. Throughput: 0: 288.4. Samples: 1291938. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:45,484][00034] Avg episode reward: [(0, '-4.421')] [2024-08-05 06:58:48,704][00139] DAMAGECOUNT value on done: 42917.0 [2024-08-05 06:58:48,705][00139] Sum rewards: -1.640, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.090', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'ARMOR': '0.004', 'AMMO5': '0.007', 'weapon5': '0.014', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'AMMO3': '0.167', 'weapon4': '0.262', 'HITCOUNT': '0.350', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.179', 'weapon2': '1.280', 'weapon3': '1.694', 'FRAGCOUNT': '2.500'} [2024-08-05 06:58:48,949][00139] DAMAGECOUNT value on done: 44097.0 [2024-08-05 06:58:48,950][00139] Sum rewards: -4.123, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.020', 'AMMO5': '0.009', 'WEAPON1': '0.010', 'AMMO2': '0.029', 'weapon5': '0.036', 'ARMOR': '0.060', 'AMMO4': '0.142', 'WEAPON4': '0.150', 'weapon4': '0.154', 'AMMO3': '0.158', 'WEAPON5': '0.200', 'HITCOUNT': '0.410', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.850', 'weapon2': '1.202', 'DAMAGECOUNT': '1.407', 'weapon3': '2.080'} [2024-08-05 06:58:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5169152. Throughput: 0: 287.9. Samples: 1293654. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:50,484][00034] Avg episode reward: [(0, '-4.468')] [2024-08-05 06:58:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5177344. Throughput: 0: 288.8. Samples: 1295394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:58:55,484][00034] Avg episode reward: [(0, '-4.468')] [2024-08-05 06:59:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5185536. Throughput: 0: 288.2. Samples: 1296275. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:00,485][00034] Avg episode reward: [(0, '-4.468')] [2024-08-05 06:59:03,346][00139] DAMAGECOUNT value on done: 43096.0 [2024-08-05 06:59:03,562][00139] DAMAGECOUNT value on done: 44631.0 [2024-08-05 06:59:03,563][00139] Sum rewards: 0.076, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.196', 'AMMO5': '0.012', 'weapon4': '0.018', 'WEAPON1': '0.020', 'AMMO2': '0.032', 'weapon5': '0.082', 'WEAPON4': '0.100', 'AMMO3': '0.123', 'AMMO4': '0.162', 'WEAPON5': '0.200', 'HITCOUNT': '0.300', 'WEAPON3': '0.500', 'weapon2': '1.300', 'DAMAGECOUNT': '1.602', 'FRAGCOUNT': '2.000', 'weapon3': '2.070'} [2024-08-05 06:59:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5185536. Throughput: 0: 288.2. Samples: 1297965. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:05,484][00034] Avg episode reward: [(0, '-4.442')] [2024-08-05 06:59:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5193728. Throughput: 0: 288.2. Samples: 1299728. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:10,484][00034] Avg episode reward: [(0, '-4.442')] [2024-08-05 06:59:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5201920. Throughput: 0: 287.6. Samples: 1300580. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:15,485][00034] Avg episode reward: [(0, '-4.442')] [2024-08-05 06:59:17,637][00139] DAMAGECOUNT value on done: 43557.0 [2024-08-05 06:59:17,637][00139] Sum rewards: -0.144, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.184', 'AMMO5': '0.005', 'AMMO2': '0.013', 'ARMOR': '0.024', 'weapon5': '0.052', 'AMMO4': '0.064', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.155', 'HITCOUNT': '0.380', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.383', 'weapon2': '1.700', 'weapon3': '1.864', 'FRAGCOUNT': '4.000'} [2024-08-05 06:59:17,861][00139] DAMAGECOUNT value on done: 44751.0 [2024-08-05 06:59:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5210112. Throughput: 0: 288.9. Samples: 1302366. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:20,484][00034] Avg episode reward: [(0, '-4.443')] [2024-08-05 06:59:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5210112. Throughput: 0: 290.3. Samples: 1304152. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:25,484][00034] Avg episode reward: [(0, '-4.443')] [2024-08-05 06:59:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5218304. Throughput: 0: 290.7. Samples: 1305019. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:30,484][00034] Avg episode reward: [(0, '-4.443')] [2024-08-05 06:59:32,114][00139] DAMAGECOUNT value on done: 43942.0 [2024-08-05 06:59:32,114][00139] Sum rewards: 1.878, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.285', 'AMMO2': '0.003', 'AMMO4': '0.015', 'ARMOR': '0.040', 'weapon7': '0.052', 'weapon4': '0.110', 'AMMO3': '0.118', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON4': '0.150', 'WEAPON7': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.155', 'weapon2': '1.508', 'weapon3': '1.932', 'FRAGCOUNT': '4.000'} [2024-08-05 06:59:32,354][00139] DAMAGECOUNT value on done: 45256.0 [2024-08-05 06:59:32,355][00139] Sum rewards: 2.391, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.052', 'AMMO5': '0.012', 'AMMO2': '0.031', 'WEAPON5': '0.100', 'AMMO3': '0.116', 'WEAPON4': '0.150', 'AMMO4': '0.156', 'weapon4': '0.158', 'weapon5': '0.176', 'HITCOUNT': '0.220', 'ARMOR': '0.400', 'WEAPON3': '0.600', 'weapon2': '1.384', 'weapon3': '1.424', 'DAMAGECOUNT': '1.515', 'FRAGCOUNT': '5.000'} [2024-08-05 06:59:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5226496. Throughput: 0: 289.4. Samples: 1306678. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:35,486][00034] Avg episode reward: [(0, '-4.322')] [2024-08-05 06:59:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_5226496.pth... [2024-08-05 06:59:35,565][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000604_4947968.pth [2024-08-05 06:59:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5226496. Throughput: 0: 290.8. Samples: 1308479. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:40,484][00034] Avg episode reward: [(0, '-4.322')] [2024-08-05 06:59:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 5234688. Throughput: 0: 290.5. Samples: 1309349. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:45,484][00034] Avg episode reward: [(0, '-4.322')] [2024-08-05 06:59:46,413][00139] DAMAGECOUNT value on done: 44259.0 [2024-08-05 06:59:46,413][00139] Sum rewards: -2.664, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.136', 'AMMO2': '0.005', 'ARMOR': '0.012', 'weapon4': '0.016', 'AMMO5': '0.017', 'AMMO4': '0.023', 'WEAPON1': '0.050', 'WEAPON4': '0.050', 'AMMO3': '0.145', 'weapon5': '0.192', 'HITCOUNT': '0.200', 'WEAPON5': '0.400', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.951', 'weapon2': '1.134', 'weapon3': '1.976', 'FRAGCOUNT': '2.000'} [2024-08-05 06:59:46,627][00139] DAMAGECOUNT value on done: 45625.0 [2024-08-05 06:59:46,627][00139] Sum rewards: -2.129, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.080', 'AMMO5': '0.010', 'AMMO2': '0.016', 'ARMOR': '0.050', 'weapon4': '0.080', 'AMMO4': '0.082', 'WEAPON4': '0.100', 'weapon5': '0.132', 'AMMO3': '0.182', 'HITCOUNT': '0.190', 'WEAPON5': '0.250', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.107', 'weapon2': '1.656', 'weapon3': '1.796', 'FRAGCOUNT': '3.000'} [2024-08-05 06:59:47,843][00138] Updated weights for policy 0, policy_version 640 (0.0017) [2024-08-05 06:59:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5242880. Throughput: 0: 292.2. Samples: 1311115. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:50,485][00034] Avg episode reward: [(0, '-4.234')] [2024-08-05 06:59:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5251072. Throughput: 0: 292.5. Samples: 1312892. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 06:59:55,484][00034] Avg episode reward: [(0, '-4.234')] [2024-08-05 07:00:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5251072. Throughput: 0: 293.3. Samples: 1313779. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:00,484][00034] Avg episode reward: [(0, '-4.234')] [2024-08-05 07:00:00,718][00139] DAMAGECOUNT value on done: 44848.0 [2024-08-05 07:00:00,719][00139] Sum rewards: 3.975, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.466', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'AMMO2': '0.023', 'ARMOR': '0.040', 'weapon5': '0.058', 'WEAPON5': '0.100', 'AMMO4': '0.114', 'AMMO3': '0.132', 'WEAPON4': '0.200', 'weapon4': '0.238', 'HITCOUNT': '0.390', 'WEAPON3': '0.650', 'weapon2': '1.102', 'DAMAGECOUNT': '1.767', 'weapon3': '2.112', 'FRAGCOUNT': '5.000'} [2024-08-05 07:00:00,937][00139] DAMAGECOUNT value on done: 45870.0 [2024-08-05 07:00:00,937][00139] Sum rewards: -2.894, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.803', 'AMMO2': '0.020', 'ARMOR': '0.060', 'AMMO4': '0.099', 'AMMO3': '0.117', 'WEAPON4': '0.150', 'HITCOUNT': '0.190', 'weapon4': '0.416', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.800', 'weapon2': '1.228', 'weapon3': '1.844', 'FRAGCOUNT': '2.000'} [2024-08-05 07:00:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5259264. Throughput: 0: 291.7. Samples: 1315491. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:05,487][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 07:00:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5267456. Throughput: 0: 291.4. Samples: 1317263. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:10,484][00034] Avg episode reward: [(0, '-4.170')] [2024-08-05 07:00:15,022][00139] DAMAGECOUNT value on done: 45165.0 [2024-08-05 07:00:15,245][00139] DAMAGECOUNT value on done: 46220.0 [2024-08-05 07:00:15,245][00139] Sum rewards: -1.660, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.180', 'AMMO2': '0.013', 'AMMO5': '0.020', 'weapon5': '0.024', 'AMMO4': '0.066', 'AMMO3': '0.137', 'WEAPON5': '0.200', 'HITCOUNT': '0.250', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.050', 'weapon3': '1.374', 'FRAGCOUNT': '2.000', 'weapon2': '2.186'} [2024-08-05 07:00:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5267456. Throughput: 0: 291.5. Samples: 1318137. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:15,484][00034] Avg episode reward: [(0, '-4.174')] [2024-08-05 07:00:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 5275648. Throughput: 0: 293.8. Samples: 1319897. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:20,485][00034] Avg episode reward: [(0, '-4.174')] [2024-08-05 07:00:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5283840. Throughput: 0: 292.1. Samples: 1321622. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:25,484][00034] Avg episode reward: [(0, '-4.174')] [2024-08-05 07:00:29,738][00139] DAMAGECOUNT value on done: 45360.0 [2024-08-05 07:00:29,738][00139] Sum rewards: -3.900, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.812', 'AMMO5': '0.003', 'weapon5': '0.006', 'AMMO2': '0.016', 'WEAPON5': '0.050', 'AMMO4': '0.080', 'AMMO3': '0.126', 'HITCOUNT': '0.210', 'WEAPON4': '0.350', 'ARMOR': '0.460', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.585', 'weapon4': '0.590', 'WEAPON3': '0.750', 'weapon2': '1.058', 'weapon3': '1.878'} [2024-08-05 07:00:29,975][00139] DAMAGECOUNT value on done: 46482.0 [2024-08-05 07:00:29,976][00139] Sum rewards: -2.244, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.243', 'AMMO2': '0.008', 'weapon5': '0.010', 'AMMO5': '0.013', 'AMMO4': '0.040', 'weapon7': '0.064', 'weapon4': '0.078', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.176', 'HITCOUNT': '0.200', 'WEAPON7': '0.200', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.786', 'WEAPON3': '1.000', 'weapon2': '1.436', 'weapon3': '1.648', 'FRAGCOUNT': '4.000'} [2024-08-05 07:00:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5292032. Throughput: 0: 291.7. Samples: 1322477. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:30,484][00034] Avg episode reward: [(0, '-4.054')] [2024-08-05 07:00:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5292032. Throughput: 0: 290.6. Samples: 1324191. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:35,484][00034] Avg episode reward: [(0, '-4.054')] [2024-08-05 07:00:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5300224. Throughput: 0: 289.0. Samples: 1325896. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:40,484][00034] Avg episode reward: [(0, '-4.054')] [2024-08-05 07:00:44,316][00139] DAMAGECOUNT value on done: 45430.0 [2024-08-05 07:00:44,316][00139] Sum rewards: -1.341, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.120', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'AMMO5': '0.003', 'weapon5': '0.012', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'ARMOR': '0.072', 'weapon4': '0.076', 'HITCOUNT': '0.080', 'AMMO3': '0.118', 'DAMAGECOUNT': '0.210', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.194', 'weapon3': '2.272'} [2024-08-05 07:00:44,540][00139] DAMAGECOUNT value on done: 46825.0 [2024-08-05 07:00:44,540][00139] Sum rewards: -3.550, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.772', 'AMMO5': '0.007', 'AMMO2': '0.032', 'ARMOR': '0.040', 'weapon5': '0.054', 'weapon4': '0.140', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'AMMO4': '0.157', 'HITCOUNT': '0.180', 'AMMO3': '0.184', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.029', 'weapon3': '1.410', 'FRAGCOUNT': '1.500', 'weapon2': '1.788'} [2024-08-05 07:00:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5308416. Throughput: 0: 289.1. Samples: 1326790. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:45,485][00034] Avg episode reward: [(0, '-4.098')] [2024-08-05 07:00:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5308416. Throughput: 0: 289.8. Samples: 1328531. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:50,484][00034] Avg episode reward: [(0, '-4.098')] [2024-08-05 07:00:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5316608. Throughput: 0: 289.6. Samples: 1330297. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:00:55,484][00034] Avg episode reward: [(0, '-4.098')] [2024-08-05 07:00:58,399][00138] Updated weights for policy 0, policy_version 650 (0.0018) [2024-08-05 07:00:58,854][00139] DAMAGECOUNT value on done: 45767.0 [2024-08-05 07:00:58,854][00139] Sum rewards: -4.737, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.560', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.011', 'AMMO2': '0.019', 'weapon5': '0.030', 'weapon7': '0.034', 'ARMOR': '0.040', 'AMMO4': '0.094', 'AMMO3': '0.137', 'WEAPON4': '0.150', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'weapon4': '0.260', 'AMMO6': '0.300', 'WEAPON7': '0.300', 'AMMO7': '0.300', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.011', 'weapon2': '1.416', 'weapon3': '1.710'} [2024-08-05 07:00:59,087][00139] DAMAGECOUNT value on done: 47124.0 [2024-08-05 07:00:59,087][00139] Sum rewards: -4.726, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.466', 'AMMO5': '0.005', 'ARMOR': '0.008', 'AMMO2': '0.010', 'AMMO4': '0.047', 'weapon5': '0.054', 'weapon7': '0.084', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.139', 'WEAPON5': '0.150', 'WEAPON7': '0.200', 'weapon4': '0.218', 'HITCOUNT': '0.220', 'WEAPON4': '0.250', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.897', 'FRAGCOUNT': '1.000', 'weapon3': '1.434', 'weapon2': '1.534'} [2024-08-05 07:01:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5324800. Throughput: 0: 289.2. Samples: 1331151. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:00,484][00034] Avg episode reward: [(0, '-4.055')] [2024-08-05 07:01:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5324800. Throughput: 0: 287.1. Samples: 1332818. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:05,484][00034] Avg episode reward: [(0, '-4.055')] [2024-08-05 07:01:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5332992. 
Throughput: 0: 286.8. Samples: 1334527. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:10,484][00034] Avg episode reward: [(0, '-4.055')] [2024-08-05 07:01:13,653][00139] DAMAGECOUNT value on done: 45981.0 [2024-08-05 07:01:13,654][00139] Sum rewards: -5.236, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.805', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.015', 'weapon5': '0.020', 'ARMOR': '0.032', 'AMMO4': '0.073', 'weapon4': '0.096', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.170', 'HITCOUNT': '0.180', 'DAMAGECOUNT': '0.642', 'WEAPON3': '1.000', 'weapon2': '1.258', 'weapon3': '1.866', 'FRAGCOUNT': '3.000'} [2024-08-05 07:01:13,863][00139] DAMAGECOUNT value on done: 47141.0 [2024-08-05 07:01:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5341184. Throughput: 0: 286.8. Samples: 1335381. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:15,484][00034] Avg episode reward: [(0, '-4.090')] [2024-08-05 07:01:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5349376. Throughput: 0: 288.9. Samples: 1337191. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:20,484][00034] Avg episode reward: [(0, '-4.090')] [2024-08-05 07:01:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5349376. Throughput: 0: 290.5. Samples: 1338967. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:25,484][00034] Avg episode reward: [(0, '-4.090')] [2024-08-05 07:01:27,849][00139] DAMAGECOUNT value on done: 46244.0 [2024-08-05 07:01:27,850][00139] Sum rewards: -0.259, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.076', 'AMMO2': '0.005', 'AMMO5': '0.011', 'AMMO4': '0.024', 'weapon4': '0.062', 'WEAPON4': '0.100', 'weapon5': '0.104', 'AMMO3': '0.114', 'WEAPON5': '0.150', 'HITCOUNT': '0.210', 'ARMOR': '0.404', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.789', 'FRAGCOUNT': '1.000', 'weapon2': '1.398', 'weapon3': '1.796'} [2024-08-05 07:01:28,076][00139] DAMAGECOUNT value on done: 47264.0 [2024-08-05 07:01:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5357568. Throughput: 0: 289.4. Samples: 1339813. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:30,484][00034] Avg episode reward: [(0, '-4.102')] [2024-08-05 07:01:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5365760. Throughput: 0: 289.4. Samples: 1341553. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:35,484][00034] Avg episode reward: [(0, '-4.102')] [2024-08-05 07:01:35,491][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000655_5365760.pth... [2024-08-05 07:01:35,561][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000621_5087232.pth [2024-08-05 07:01:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5365760. Throughput: 0: 288.4. Samples: 1343276. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:40,484][00034] Avg episode reward: [(0, '-4.102')] [2024-08-05 07:01:42,416][00139] DAMAGECOUNT value on done: 46376.0 [2024-08-05 07:01:42,416][00139] Sum rewards: -3.957, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-0.500', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO2': '0.019', 'HITCOUNT': '0.040', 'AMMO3': '0.075', 'AMMO4': '0.095', 'WEAPON4': '0.100', 'weapon4': '0.136', 'weapon5': '0.152', 'HEALTH': '0.216', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.396', 'WEAPON3': '0.400', 'ARMOR': '0.510', 'weapon3': '1.064', 'weapon2': '2.018'} [2024-08-05 07:01:42,655][00139] DAMAGECOUNT value on done: 47634.0 [2024-08-05 07:01:42,655][00139] Sum rewards: -4.018, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.505', 'AMMO5': '0.009', 'weapon5': '0.020', 'ARMOR': '0.028', 'AMMO2': '0.031', 'weapon7': '0.058', 'WEAPON5': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO4': '0.154', 'AMMO3': '0.172', 'WEAPON7': '0.200', 'HITCOUNT': '0.240', 'WEAPON4': '0.350', 'weapon4': '0.478', 'WEAPON3': '0.900', 'weapon2': '1.060', 'DAMAGECOUNT': '1.110', 'weapon3': '1.586', 'FRAGCOUNT': '2.000'} [2024-08-05 07:01:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5373952. Throughput: 0: 288.5. Samples: 1344133. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:45,484][00034] Avg episode reward: [(0, '-4.002')] [2024-08-05 07:01:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5382144. Throughput: 0: 290.0. Samples: 1345866. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:50,484][00034] Avg episode reward: [(0, '-4.002')] [2024-08-05 07:01:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5390336. Throughput: 0: 291.6. Samples: 1347650. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:01:55,485][00034] Avg episode reward: [(0, '-4.002')] [2024-08-05 07:01:56,851][00139] DAMAGECOUNT value on done: 46467.0 [2024-08-05 07:01:56,851][00139] Sum rewards: -8.021, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-0.302', 'AMMO5': '0.007', 'weapon5': '0.012', 'AMMO2': '0.015', 'WEAPON1': '0.020', 'AMMO4': '0.074', 'HITCOUNT': '0.090', 'WEAPON5': '0.150', 'AMMO3': '0.193', 'DAMAGECOUNT': '0.273', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.398', 'weapon3': '1.598'} [2024-08-05 07:01:57,087][00139] DAMAGECOUNT value on done: 47919.0 [2024-08-05 07:01:57,087][00139] Sum rewards: -9.600, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.106', 'AMMO5': '0.012', 'AMMO2': '0.018', 'ARMOR': '0.032', 'weapon5': '0.048', 'WEAPON4': '0.050', 'AMMO4': '0.090', 'AMMO3': '0.199', 'WEAPON5': '0.200', 'HITCOUNT': '0.250', 'DAMAGECOUNT': '0.855', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.100', 'weapon3': '1.406', 'weapon2': '1.496'} [2024-08-05 07:02:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5390336. Throughput: 0: 291.3. Samples: 1348489. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:00,484][00034] Avg episode reward: [(0, '-4.146')] [2024-08-05 07:02:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5398528. Throughput: 0: 289.4. Samples: 1350213. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:05,484][00034] Avg episode reward: [(0, '-4.146')] [2024-08-05 07:02:09,400][00138] Updated weights for policy 0, policy_version 660 (0.0017) [2024-08-05 07:02:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5406720. Throughput: 0: 286.9. Samples: 1351879. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:10,484][00034] Avg episode reward: [(0, '-4.146')] [2024-08-05 07:02:11,657][00139] DAMAGECOUNT value on done: 46687.0 [2024-08-05 07:02:11,658][00139] Sum rewards: -0.105, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO5': '0.004', 'AMMO2': '0.005', 'AMMO4': '0.027', 'weapon5': '0.028', 'ARMOR': '0.043', 'WEAPON5': '0.100', 'AMMO3': '0.119', 'HEALTH': '0.148', 'HITCOUNT': '0.160', 'DAMAGECOUNT': '0.660', 'WEAPON3': '0.700', 'weapon2': '1.500', 'weapon3': '1.650', 'FRAGCOUNT': '3.000'} [2024-08-05 07:02:11,880][00139] DAMAGECOUNT value on done: 48214.0 [2024-08-05 07:02:11,881][00139] Sum rewards: -0.271, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.054', 'AMMO2': '0.005', 'AMMO4': '0.025', 'WEAPON4': '0.050', 'weapon7': '0.076', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.110', 'HITCOUNT': '0.210', 'weapon4': '0.310', 'ARMOR': '0.492', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.885', 'weapon3': '1.068', 'weapon2': '1.102', 'FRAGCOUNT': '3.000'} [2024-08-05 07:02:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5406720. Throughput: 0: 288.0. Samples: 1352775. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:15,484][00034] Avg episode reward: [(0, '-4.032')] [2024-08-05 07:02:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5414912. Throughput: 0: 288.0. Samples: 1354515. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:20,484][00034] Avg episode reward: [(0, '-4.032')] [2024-08-05 07:02:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5423104. Throughput: 0: 288.3. Samples: 1356248. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:25,485][00034] Avg episode reward: [(0, '-4.032')] [2024-08-05 07:02:26,123][00139] DAMAGECOUNT value on done: 46807.0 [2024-08-05 07:02:26,123][00139] Sum rewards: -5.781, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.110', 'AMMO5': '0.009', 'WEAPON1': '0.010', 'AMMO2': '0.012', 'weapon5': '0.046', 'AMMO4': '0.061', 'HITCOUNT': '0.120', 'AMMO3': '0.139', 'WEAPON4': '0.200', 'WEAPON5': '0.200', 'weapon4': '0.286', 'DAMAGECOUNT': '0.360', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.750', 'weapon2': '1.114', 'weapon3': '1.522'} [2024-08-05 07:02:26,353][00139] DAMAGECOUNT value on done: 48419.0 [2024-08-05 07:02:26,354][00139] Sum rewards: -4.440, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.754', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'AMMO2': '0.012', 'AMMO4': '0.061', 'weapon5': '0.080', 'AMMO3': '0.124', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.170', 'HITCOUNT': '0.180', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.700', 'weapon2': '1.096', 'weapon3': '1.718'} [2024-08-05 07:02:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5423104. Throughput: 0: 288.6. Samples: 1357120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:30,484][00034] Avg episode reward: [(0, '-4.074')] [2024-08-05 07:02:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5431296. Throughput: 0: 288.8. Samples: 1358862. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:35,484][00034] Avg episode reward: [(0, '-4.074')] [2024-08-05 07:02:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5439488. Throughput: 0: 287.5. Samples: 1360589. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:40,484][00034] Avg episode reward: [(0, '-4.074')] [2024-08-05 07:02:40,698][00139] DAMAGECOUNT value on done: 46937.0 [2024-08-05 07:02:40,952][00139] DAMAGECOUNT value on done: 48454.0 [2024-08-05 07:02:40,952][00139] Sum rewards: -9.785, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.390', 'FRAGCOUNT': '-1.500', 'AMMO2': '0.002', 'AMMO5': '0.005', 'AMMO4': '0.007', 'weapon5': '0.014', 'HITCOUNT': '0.030', 'weapon7': '0.042', 'ARMOR': '0.056', 'WEAPON5': '0.100', 'DAMAGECOUNT': '0.105', 'AMMO3': '0.122', 'AMMO6': '0.160', 'AMMO7': '0.160', 'WEAPON7': '0.200', 'WEAPON3': '0.750', 'weapon2': '1.416', 'weapon3': '1.436'} [2024-08-05 07:02:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5447680. Throughput: 0: 287.7. Samples: 1361435. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:45,484][00034] Avg episode reward: [(0, '-4.169')] [2024-08-05 07:02:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5447680. Throughput: 0: 288.4. Samples: 1363192. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:50,485][00034] Avg episode reward: [(0, '-4.169')] [2024-08-05 07:02:55,077][00139] DAMAGECOUNT value on done: 47133.0 [2024-08-05 07:02:55,077][00139] Sum rewards: -2.727, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.676', 'AMMO5': '0.003', 'AMMO2': '0.019', 'WEAPON1': '0.030', 'ARMOR': '0.032', 'WEAPON5': '0.050', 'AMMO4': '0.094', 'HITCOUNT': '0.120', 'weapon4': '0.140', 'AMMO3': '0.141', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.588', 'WEAPON3': '0.750', 'weapon2': '0.976', 'weapon3': '1.756', 'FRAGCOUNT': '3.000'} [2024-08-05 07:02:55,307][00139] DAMAGECOUNT value on done: 48675.0 [2024-08-05 07:02:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5455872. Throughput: 0: 290.4. Samples: 1364948. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:02:55,484][00034] Avg episode reward: [(0, '-4.202')] [2024-08-05 07:03:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5464064. Throughput: 0: 290.1. Samples: 1365830. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:00,484][00034] Avg episode reward: [(0, '-4.202')] [2024-08-05 07:03:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5464064. Throughput: 0: 289.4. Samples: 1367538. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:05,484][00034] Avg episode reward: [(0, '-4.202')] [2024-08-05 07:03:09,685][00139] DAMAGECOUNT value on done: 47232.0 [2024-08-05 07:03:09,926][00139] DAMAGECOUNT value on done: 48745.0 [2024-08-05 07:03:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5472256. Throughput: 0: 289.2. Samples: 1369260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:10,485][00034] Avg episode reward: [(0, '-4.164')] [2024-08-05 07:03:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5480448. Throughput: 0: 288.6. Samples: 1370109. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:15,485][00034] Avg episode reward: [(0, '-4.164')] [2024-08-05 07:03:20,137][00138] Updated weights for policy 0, policy_version 670 (0.0018) [2024-08-05 07:03:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5488640. Throughput: 0: 289.7. Samples: 1371900. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:20,484][00034] Avg episode reward: [(0, '-4.164')] [2024-08-05 07:03:23,998][00139] DAMAGECOUNT value on done: 47297.0 [2024-08-05 07:03:24,226][00139] DAMAGECOUNT value on done: 48860.0 [2024-08-05 07:03:24,226][00139] Sum rewards: -4.154, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.220', 'FRAGCOUNT': '-0.500', 'WEAPON1': '0.010', 'AMMO5': '0.011', 'AMMO2': '0.020', 'weapon5': '0.044', 'HITCOUNT': '0.060', 'ARMOR': '0.072', 'AMMO3': '0.081', 'AMMO4': '0.099', 'weapon4': '0.118', 'WEAPON5': '0.150', 'WEAPON4': '0.300', 'DAMAGECOUNT': '0.345', 'WEAPON3': '0.500', 'weapon2': '1.164', 'weapon3': '1.342'} [2024-08-05 07:03:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5488640. Throughput: 0: 290.4. Samples: 1373657. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:25,484][00034] Avg episode reward: [(0, '-4.163')] [2024-08-05 07:03:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5496832. Throughput: 0: 290.8. Samples: 1374519. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:30,485][00034] Avg episode reward: [(0, '-4.163')] [2024-08-05 07:03:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5505024. Throughput: 0: 291.2. Samples: 1376298. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:35,485][00034] Avg episode reward: [(0, '-4.163')] [2024-08-05 07:03:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000672_5505024.pth... 
[2024-08-05 07:03:35,566][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000638_5226496.pth [2024-08-05 07:03:38,407][00139] DAMAGECOUNT value on done: 47405.0 [2024-08-05 07:03:38,408][00139] Sum rewards: -3.506, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.606', 'WEAPON1': '0.010', 'AMMO2': '0.014', 'AMMO5': '0.017', 'ARMOR': '0.020', 'HITCOUNT': '0.040', 'weapon5': '0.044', 'AMMO4': '0.069', 'WEAPON4': '0.100', 'AMMO3': '0.114', 'weapon4': '0.120', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.324', 'WEAPON3': '0.600', 'FRAGCOUNT': '1.000', 'weapon3': '1.296', 'weapon2': '1.332'} [2024-08-05 07:03:38,629][00139] DAMAGECOUNT value on done: 49200.0 [2024-08-05 07:03:38,630][00139] Sum rewards: -4.574, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.554', 'AMMO5': '0.003', 'weapon5': '0.006', 'WEAPON1': '0.020', 'AMMO2': '0.027', 'WEAPON5': '0.050', 'ARMOR': '0.080', 'AMMO3': '0.112', 'AMMO4': '0.135', 'weapon4': '0.140', 'HITCOUNT': '0.240', 'WEAPON4': '0.350', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.020', 'weapon2': '1.150', 'weapon3': '1.448', 'FRAGCOUNT': '2.000'} [2024-08-05 07:03:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5505024. Throughput: 0: 290.5. Samples: 1378022. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:40,484][00034] Avg episode reward: [(0, '-4.182')] [2024-08-05 07:03:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5513216. Throughput: 0: 289.9. Samples: 1378875. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:45,485][00034] Avg episode reward: [(0, '-4.182')] [2024-08-05 07:03:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5521408. Throughput: 0: 289.0. Samples: 1380542. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:50,484][00034] Avg episode reward: [(0, '-4.182')] [2024-08-05 07:03:53,124][00139] DAMAGECOUNT value on done: 47615.0 [2024-08-05 07:03:53,330][00139] DAMAGECOUNT value on done: 49444.0 [2024-08-05 07:03:53,331][00139] Sum rewards: -2.668, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.922', 'AMMO2': '0.006', 'AMMO5': '0.009', 'WEAPON1': '0.010', 'AMMO4': '0.029', 'weapon4': '0.088', 'weapon5': '0.094', 'AMMO3': '0.100', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'WEAPON5': '0.250', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.732', 'FRAGCOUNT': '1.000', 'weapon2': '1.220', 'weapon3': '1.456'} [2024-08-05 07:03:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5529600. Throughput: 0: 290.6. Samples: 1382336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:03:55,484][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 07:04:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5529600. Throughput: 0: 290.8. Samples: 1383195. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:00,484][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 07:04:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5537792. Throughput: 0: 289.8. Samples: 1384941. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:05,484][00034] Avg episode reward: [(0, '-4.158')] [2024-08-05 07:04:07,491][00139] DAMAGECOUNT value on done: 47710.0 [2024-08-05 07:04:07,694][00139] DAMAGECOUNT value on done: 49799.0 [2024-08-05 07:04:07,695][00139] Sum rewards: -2.266, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.236', 'AMMO5': '0.013', 'ARMOR': '0.016', 'AMMO2': '0.022', 'weapon5': '0.032', 'AMMO4': '0.110', 'AMMO3': '0.146', 'weapon4': '0.192', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'HITCOUNT': '0.300', 'WEAPON3': '0.850', 'weapon2': '1.018', 'DAMAGECOUNT': '1.065', 'weapon3': '1.756', 'FRAGCOUNT': '2.000'} [2024-08-05 07:04:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5545984. Throughput: 0: 290.0. Samples: 1386705. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:10,485][00034] Avg episode reward: [(0, '-4.191')] [2024-08-05 07:04:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5545984. Throughput: 0: 290.6. Samples: 1387597. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:15,484][00034] Avg episode reward: [(0, '-4.191')] [2024-08-05 07:04:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5554176. Throughput: 0: 289.3. Samples: 1389318. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:20,484][00034] Avg episode reward: [(0, '-4.191')] [2024-08-05 07:04:21,942][00139] DAMAGECOUNT value on done: 47997.0 [2024-08-05 07:04:21,943][00139] Sum rewards: -3.179, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.720', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO2': '0.026', 'ARMOR': '0.032', 'weapon5': '0.070', 'weapon4': '0.108', 'AMMO4': '0.129', 'AMMO3': '0.155', 'WEAPON4': '0.200', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.861', 'FRAGCOUNT': '1.000', 'weapon3': '1.358', 'weapon2': '1.470'} [2024-08-05 07:04:22,182][00139] DAMAGECOUNT value on done: 49879.0 [2024-08-05 07:04:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5562368. Throughput: 0: 290.3. Samples: 1391085. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:25,484][00034] Avg episode reward: [(0, '-4.119')] [2024-08-05 07:04:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5562368. Throughput: 0: 291.0. Samples: 1391968. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:30,485][00034] Avg episode reward: [(0, '-4.119')] [2024-08-05 07:04:30,573][00138] Updated weights for policy 0, policy_version 680 (0.0017) [2024-08-05 07:04:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5570560. Throughput: 0: 292.3. Samples: 1393694. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:35,485][00034] Avg episode reward: [(0, '-4.119')] [2024-08-05 07:04:36,320][00139] DAMAGECOUNT value on done: 48142.0 [2024-08-05 07:04:36,321][00139] Sum rewards: -10.692, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-4.220', 'AMMO2': '0.002', 'AMMO4': '0.008', 'AMMO5': '0.009', 'weapon5': '0.040', 'weapon4': '0.122', 'AMMO3': '0.129', 'WEAPON4': '0.150', 'HITCOUNT': '0.160', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.435', 'ARMOR': '0.455', 'WEAPON3': '0.750', 'weapon3': '1.170', 'FRAGCOUNT': '1.500', 'weapon2': '1.898'} [2024-08-05 07:04:36,536][00139] DAMAGECOUNT value on done: 50074.0 [2024-08-05 07:04:36,537][00139] Sum rewards: -5.618, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.886', 'AMMO2': '0.009', 'AMMO4': '0.045', 'weapon4': '0.086', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'AMMO3': '0.155', 'ARMOR': '0.453', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.900', 'weapon2': '1.138', 'weapon3': '1.866', 'FRAGCOUNT': '2.000'} [2024-08-05 07:04:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5578752. Throughput: 0: 290.8. Samples: 1395421. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:40,484][00034] Avg episode reward: [(0, '-4.189')] [2024-08-05 07:04:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5586944. Throughput: 0: 292.0. Samples: 1396336. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:45,485][00034] Avg episode reward: [(0, '-4.189')] [2024-08-05 07:04:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5586944. Throughput: 0: 290.4. Samples: 1398009. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:50,485][00034] Avg episode reward: [(0, '-4.189')] [2024-08-05 07:04:50,875][00139] DAMAGECOUNT value on done: 48307.0 [2024-08-05 07:04:50,875][00139] Sum rewards: -3.778, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.662', 'AMMO5': '0.012', 'AMMO2': '0.025', 'ARMOR': '0.060', 'weapon5': '0.086', 'AMMO3': '0.094', 'weapon4': '0.120', 'AMMO4': '0.124', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon3': '1.248', 'weapon2': '1.440'} [2024-08-05 07:04:51,109][00139] DAMAGECOUNT value on done: 50218.0 [2024-08-05 07:04:51,110][00139] Sum rewards: -2.239, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.108', 'AMMO5': '0.003', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'ARMOR': '0.032', 'weapon4': '0.038', 'WEAPON5': '0.050', 'weapon5': '0.070', 'AMMO4': '0.085', 'AMMO3': '0.096', 'WEAPON4': '0.100', 'HITCOUNT': '0.130', 'DAMAGECOUNT': '0.432', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.024', 'weapon3': '1.872'} [2024-08-05 07:04:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5595136. Throughput: 0: 290.9. Samples: 1399796. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:04:55,484][00034] Avg episode reward: [(0, '-4.178')] [2024-08-05 07:05:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5603328. Throughput: 0: 290.6. Samples: 1400674. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:00,486][00034] Avg episode reward: [(0, '-4.178')] [2024-08-05 07:05:05,288][00139] DAMAGECOUNT value on done: 48387.0 [2024-08-05 07:05:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5603328. Throughput: 0: 290.7. Samples: 1402400. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:05,484][00034] Avg episode reward: [(0, '-4.157')] [2024-08-05 07:05:05,499][00139] DAMAGECOUNT value on done: 50414.0 [2024-08-05 07:05:05,499][00139] Sum rewards: -4.799, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.272', 'AMMO5': '0.015', 'AMMO2': '0.016', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'weapon5': '0.058', 'weapon4': '0.058', 'AMMO4': '0.080', 'HITCOUNT': '0.140', 'AMMO3': '0.157', 'WEAPON5': '0.300', 'ARMOR': '0.400', 'DAMAGECOUNT': '0.588', 'WEAPON3': '0.800', 'weapon2': '1.370', 'weapon3': '1.920', 'FRAGCOUNT': '2.000'} [2024-08-05 07:05:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5611520. Throughput: 0: 290.4. Samples: 1404151. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:10,484][00034] Avg episode reward: [(0, '-4.128')] [2024-08-05 07:05:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5619712. Throughput: 0: 290.2. Samples: 1405027. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:15,484][00034] Avg episode reward: [(0, '-4.128')] [2024-08-05 07:05:19,906][00139] DAMAGECOUNT value on done: 48511.0 [2024-08-05 07:05:19,906][00139] Sum rewards: -5.256, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.416', 'AMMO2': '0.005', 'AMMO5': '0.009', 'WEAPON1': '0.010', 'weapon5': '0.022', 'AMMO4': '0.024', 'ARMOR': '0.080', 'HITCOUNT': '0.090', 'AMMO3': '0.104', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.372', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.650', 'weapon3': '1.434', 'weapon2': '1.660'} [2024-08-05 07:05:20,215][00139] DAMAGECOUNT value on done: 50899.0 [2024-08-05 07:05:20,216][00139] Sum rewards: 3.451, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.164', 'AMMO4': '-0.031', 'AMMO2': '-0.006', 'AMMO5': '0.017', 'weapon5': '0.048', 'weapon7': '0.060', 'AMMO3': '0.076', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.250', 'HITCOUNT': '0.320', 'WEAPON3': '0.500', 'weapon3': '1.390', 'DAMAGECOUNT': '1.455', 'weapon2': '1.486', 'FRAGCOUNT': '4.000'} [2024-08-05 07:05:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5627904. Throughput: 0: 289.8. Samples: 1406733. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:20,485][00034] Avg episode reward: [(0, '-4.061')] [2024-08-05 07:05:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5627904. Throughput: 0: 288.3. Samples: 1408394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:25,484][00034] Avg episode reward: [(0, '-4.061')] [2024-08-05 07:05:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5636096. Throughput: 0: 286.9. Samples: 1409248. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:30,484][00034] Avg episode reward: [(0, '-4.061')] [2024-08-05 07:05:34,715][00139] DAMAGECOUNT value on done: 48725.0 [2024-08-05 07:05:34,716][00139] Sum rewards: -7.065, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.294', 'ARMOR': '0.004', 'AMMO5': '0.005', 'AMMO2': '0.007', 'AMMO4': '0.036', 'weapon5': '0.038', 'WEAPON5': '0.050', 'AMMO3': '0.162', 'WEAPON4': '0.200', 'HITCOUNT': '0.250', 'weapon4': '0.412', 'DAMAGECOUNT': '0.642', 'WEAPON3': '0.950', 'weapon2': '0.998', 'FRAGCOUNT': '1.000', 'weapon3': '1.474'} [2024-08-05 07:05:34,931][00139] DAMAGECOUNT value on done: 51220.0 [2024-08-05 07:05:34,931][00139] Sum rewards: -5.829, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.572', 'AMMO2': '0.015', 'ARMOR': '0.032', 'weapon7': '0.064', 'AMMO4': '0.077', 'WEAPON4': '0.100', 'weapon4': '0.106', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.130', 'HITCOUNT': '0.200', 'WEAPON7': '0.200', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.963', 'FRAGCOUNT': '1.000', 'weapon2': '1.296', 'weapon3': '1.770'} [2024-08-05 07:05:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5644288. Throughput: 0: 288.1. Samples: 1410974. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:35,485][00034] Avg episode reward: [(0, '-4.018')] [2024-08-05 07:05:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000689_5644288.pth... [2024-08-05 07:05:35,578][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000655_5365760.pth [2024-08-05 07:05:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5644288. Throughput: 0: 287.2. Samples: 1412721. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:40,484][00034] Avg episode reward: [(0, '-4.018')] [2024-08-05 07:05:41,625][00138] Updated weights for policy 0, policy_version 690 (0.0018) [2024-08-05 07:05:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5652480. Throughput: 0: 286.8. Samples: 1413579. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:45,484][00034] Avg episode reward: [(0, '-4.018')] [2024-08-05 07:05:49,282][00139] DAMAGECOUNT value on done: 48930.0 [2024-08-05 07:05:49,282][00139] Sum rewards: -4.604, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.455', 'AMMO4': '-0.018', 'AMMO2': '-0.004', 'AMMO5': '0.009', 'ARMOR': '0.024', 'weapon5': '0.036', 'HITCOUNT': '0.110', 'AMMO3': '0.116', 'WEAPON5': '0.200', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.615', 'weapon3': '0.868', 'FRAGCOUNT': '1.000', 'weapon2': '1.594'} [2024-08-05 07:05:49,506][00139] DAMAGECOUNT value on done: 51381.0 [2024-08-05 07:05:49,507][00139] Sum rewards: -4.217, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.272', 'AMMO5': '0.007', 'AMMO2': '0.027', 'ARMOR': '0.080', 'weapon7': '0.088', 'AMMO3': '0.116', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.128', 'AMMO4': '0.134', 'WEAPON5': '0.150', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'weapon4': '0.254', 'DAMAGECOUNT': '0.483', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.042', 'weapon2': '1.826'} [2024-08-05 07:05:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5660672. Throughput: 0: 286.5. Samples: 1415291. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:50,485][00034] Avg episode reward: [(0, '-4.069')] [2024-08-05 07:05:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5660672. Throughput: 0: 285.9. Samples: 1417018. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:05:55,484][00034] Avg episode reward: [(0, '-4.069')] [2024-08-05 07:06:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5668864. Throughput: 0: 285.5. Samples: 1417875. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:00,484][00034] Avg episode reward: [(0, '-4.069')] [2024-08-05 07:06:03,929][00139] DAMAGECOUNT value on done: 49112.0 [2024-08-05 07:06:03,930][00139] Sum rewards: -1.345, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.066', 'AMMO5': '0.005', 'WEAPON1': '0.010', 'AMMO2': '0.020', 'ARMOR': '0.036', 'weapon4': '0.056', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO4': '0.101', 'AMMO3': '0.117', 'HITCOUNT': '0.150', 'DAMAGECOUNT': '0.546', 'WEAPON3': '0.600', 'weapon2': '1.050', 'FRAGCOUNT': '2.000', 'weapon3': '2.080'} [2024-08-05 07:06:04,144][00139] DAMAGECOUNT value on done: 51644.0 [2024-08-05 07:06:04,144][00139] Sum rewards: -5.736, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.390', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'weapon5': '0.020', 'AMMO2': '0.021', 'ARMOR': '0.028', 'WEAPON5': '0.050', 'AMMO4': '0.105', 'AMMO3': '0.147', 'HITCOUNT': '0.260', 'WEAPON4': '0.300', 'weapon4': '0.300', 'DAMAGECOUNT': '0.789', 'WEAPON3': '0.850', 'weapon2': '1.358', 'weapon3': '1.668'} [2024-08-05 07:06:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5677056. Throughput: 0: 285.8. Samples: 1419596. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:05,484][00034] Avg episode reward: [(0, '-3.952')] [2024-08-05 07:06:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5685248. Throughput: 0: 287.3. Samples: 1421322. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:10,484][00034] Avg episode reward: [(0, '-3.952')] [2024-08-05 07:06:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 5685248. Throughput: 0: 287.5. Samples: 1422187. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:15,485][00034] Avg episode reward: [(0, '-3.952')] [2024-08-05 07:06:18,368][00139] DAMAGECOUNT value on done: 49397.0 [2024-08-05 07:06:18,368][00139] Sum rewards: -3.380, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.368', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'AMMO2': '0.023', 'weapon5': '0.092', 'AMMO3': '0.096', 'AMMO4': '0.116', 'WEAPON4': '0.150', 'weapon4': '0.188', 'HITCOUNT': '0.230', 'WEAPON5': '0.250', 'WEAPON3': '0.450', 'ARMOR': '0.453', 'DAMAGECOUNT': '0.855', 'FRAGCOUNT': '1.000', 'weapon3': '1.024', 'weapon2': '1.786'} [2024-08-05 07:06:18,607][00139] DAMAGECOUNT value on done: 51794.0 [2024-08-05 07:06:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5693440. Throughput: 0: 288.1. Samples: 1423939. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:20,484][00034] Avg episode reward: [(0, '-3.926')] [2024-08-05 07:06:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5701632. Throughput: 0: 285.2. Samples: 1425553. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:25,484][00034] Avg episode reward: [(0, '-3.926')] [2024-08-05 07:06:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5701632. Throughput: 0: 285.0. Samples: 1426404. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:30,485][00034] Avg episode reward: [(0, '-3.926')] [2024-08-05 07:06:33,330][00139] DAMAGECOUNT value on done: 49637.0 [2024-08-05 07:06:33,331][00139] Sum rewards: -5.726, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-0.860', 'AMMO5': '0.003', 'weapon5': '0.008', 'AMMO2': '0.030', 'WEAPON5': '0.050', 'weapon4': '0.096', 'AMMO4': '0.151', 'AMMO3': '0.174', 'HITCOUNT': '0.200', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.720', 'WEAPON3': '0.950', 'weapon2': '1.438', 'weapon3': '1.814', 'FRAGCOUNT': '2.000'} [2024-08-05 07:06:33,557][00139] DAMAGECOUNT value on done: 52010.0 [2024-08-05 07:06:33,557][00139] Sum rewards: -4.237, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.466', 'AMMO2': '0.017', 'ARMOR': '0.032', 'weapon7': '0.082', 'AMMO4': '0.084', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.118', 'weapon4': '0.156', 'HITCOUNT': '0.180', 'DAMAGECOUNT': '0.648', 'WEAPON3': '0.700', 'weapon3': '1.228', 'weapon2': '1.834', 'FRAGCOUNT': '2.000'} [2024-08-05 07:06:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5709824. Throughput: 0: 286.0. Samples: 1428161. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:35,484][00034] Avg episode reward: [(0, '-3.983')] [2024-08-05 07:06:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5718016. Throughput: 0: 282.6. Samples: 1429736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:40,485][00034] Avg episode reward: [(0, '-3.983')] [2024-08-05 07:06:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5718016. Throughput: 0: 282.4. Samples: 1430581. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:45,484][00034] Avg episode reward: [(0, '-3.983')] [2024-08-05 07:06:48,570][00139] DAMAGECOUNT value on done: 49881.0 [2024-08-05 07:06:48,571][00139] Sum rewards: -5.773, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.370', 'AMMO2': '0.010', 'ARMOR': '0.032', 'AMMO4': '0.050', 'weapon7': '0.070', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.138', 'WEAPON4': '0.150', 'HITCOUNT': '0.200', 'WEAPON7': '0.200', 'weapon4': '0.330', 'DAMAGECOUNT': '0.732', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.294', 'weapon3': '1.550'} [2024-08-05 07:06:48,813][00139] DAMAGECOUNT value on done: 52316.0 [2024-08-05 07:06:48,814][00139] Sum rewards: -1.232, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.558', 'AMMO5': '0.003', 'AMMO2': '0.012', 'WEAPON1': '0.030', 'weapon5': '0.054', 'AMMO4': '0.057', 'weapon4': '0.066', 'ARMOR': '0.076', 'WEAPON5': '0.100', 'AMMO3': '0.114', 'WEAPON4': '0.150', 'HITCOUNT': '0.170', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.918', 'weapon2': '1.526', 'weapon3': '1.750', 'FRAGCOUNT': '2.000'} [2024-08-05 07:06:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5726208. Throughput: 0: 281.9. Samples: 1432281. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:50,485][00034] Avg episode reward: [(0, '-3.978')] [2024-08-05 07:06:53,780][00138] Updated weights for policy 0, policy_version 700 (0.0019) [2024-08-05 07:06:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5734400. Throughput: 0: 281.3. Samples: 1433982. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:06:55,485][00034] Avg episode reward: [(0, '-3.978')] [2024-08-05 07:07:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5734400. Throughput: 0: 280.9. Samples: 1434826. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:00,484][00034] Avg episode reward: [(0, '-3.978')] [2024-08-05 07:07:03,386][00139] DAMAGECOUNT value on done: 50215.0 [2024-08-05 07:07:03,386][00139] Sum rewards: 0.694, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.146', 'AMMO5': '0.009', 'AMMO2': '0.023', 'AMMO3': '0.080', 'ARMOR': '0.096', 'AMMO4': '0.112', 'weapon5': '0.120', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'weapon4': '0.302', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.002', 'weapon2': '1.224', 'weapon3': '1.452', 'FRAGCOUNT': '2.000'} [2024-08-05 07:07:03,612][00139] DAMAGECOUNT value on done: 52965.0 [2024-08-05 07:07:03,612][00139] Sum rewards: -0.563, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.776', 'WEAPON1': '0.010', 'AMMO5': '0.010', 'AMMO2': '0.026', 'AMMO3': '0.063', 'weapon5': '0.128', 'AMMO4': '0.132', 'WEAPON4': '0.200', 'WEAPON3': '0.250', 'WEAPON5': '0.250', 'HITCOUNT': '0.260', 'weapon4': '0.312', 'ARMOR': '0.472', 'weapon3': '0.794', 'weapon2': '1.858', 'DAMAGECOUNT': '1.947', 'FRAGCOUNT': '2.500'} [2024-08-05 07:07:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5742592. Throughput: 0: 280.0. Samples: 1436540. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:05,487][00034] Avg episode reward: [(0, '-3.905')] [2024-08-05 07:07:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5750784. Throughput: 0: 283.1. Samples: 1438293. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:10,484][00034] Avg episode reward: [(0, '-3.905')] [2024-08-05 07:07:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5758976. Throughput: 0: 284.4. Samples: 1439204. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:15,484][00034] Avg episode reward: [(0, '-3.905')] [2024-08-05 07:07:17,921][00139] DAMAGECOUNT value on done: 50539.0 [2024-08-05 07:07:17,922][00139] Sum rewards: -3.548, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.258', 'FRAGCOUNT': '-1.000', 'ARMOR': '0.004', 'AMMO2': '0.009', 'AMMO5': '0.022', 'AMMO4': '0.044', 'weapon7': '0.078', 'weapon4': '0.096', 'WEAPON4': '0.100', 'weapon5': '0.104', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.130', 'HITCOUNT': '0.140', 'WEAPON7': '0.200', 'WEAPON5': '0.350', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.972', 'weapon2': '1.294', 'weapon3': '1.726'} [2024-08-05 07:07:18,159][00139] DAMAGECOUNT value on done: 53259.0 [2024-08-05 07:07:18,159][00139] Sum rewards: -0.254, reward structure: {'DEATHCOUNT': '-9.000', 'weapon5': '0.014', 'AMMO5': '0.015', 'AMMO2': '0.025', 'HEALTH': '0.031', 'ARMOR': '0.044', 'AMMO4': '0.122', 'AMMO3': '0.131', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'HITCOUNT': '0.220', 'weapon4': '0.278', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.882', 'weapon2': '1.264', 'weapon3': '1.720', 'FRAGCOUNT': '3.000'} [2024-08-05 07:07:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5758976. Throughput: 0: 282.5. Samples: 1440873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:20,484][00034] Avg episode reward: [(0, '-3.849')] [2024-08-05 07:07:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1166.3). Total num frames: 5767168. Throughput: 0: 285.2. Samples: 1442571. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:25,484][00034] Avg episode reward: [(0, '-3.849')] [2024-08-05 07:07:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5775360. Throughput: 0: 285.4. Samples: 1443424. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:30,484][00034] Avg episode reward: [(0, '-3.849')] [2024-08-05 07:07:32,728][00139] DAMAGECOUNT value on done: 50712.0 [2024-08-05 07:07:32,728][00139] Sum rewards: -0.195, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.669', 'AMMO5': '0.003', 'WEAPON1': '0.010', 'AMMO2': '0.012', 'weapon5': '0.056', 'AMMO4': '0.057', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.111', 'weapon4': '0.148', 'ARMOR': '0.404', 'DAMAGECOUNT': '0.519', 'WEAPON3': '0.550', 'weapon2': '0.956', 'FRAGCOUNT': '1.000', 'weapon3': '1.608'} [2024-08-05 07:07:32,976][00139] DAMAGECOUNT value on done: 53554.0 [2024-08-05 07:07:32,977][00139] Sum rewards: -4.748, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.795', 'AMMO5': '0.015', 'WEAPON1': '0.020', 'AMMO2': '0.025', 'ARMOR': '0.028', 'weapon5': '0.084', 'AMMO3': '0.107', 'AMMO4': '0.123', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'WEAPON4': '0.300', 'weapon4': '0.454', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.885', 'weapon3': '1.356', 'weapon2': '1.740'} [2024-08-05 07:07:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5775360. Throughput: 0: 285.6. Samples: 1445135. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:35,485][00034] Avg episode reward: [(0, '-3.851')] [2024-08-05 07:07:35,496][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000705_5775360.pth... [2024-08-05 07:07:35,579][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000672_5505024.pth [2024-08-05 07:07:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5783552. Throughput: 0: 285.7. Samples: 1446838. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:40,484][00034] Avg episode reward: [(0, '-3.851')] [2024-08-05 07:07:45,484][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5791744. Throughput: 0: 286.8. Samples: 1447732. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:45,486][00034] Avg episode reward: [(0, '-3.851')] [2024-08-05 07:07:47,362][00139] DAMAGECOUNT value on done: 50982.0 [2024-08-05 07:07:47,362][00139] Sum rewards: -5.376, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-0.236', 'AMMO5': '0.014', 'AMMO2': '0.024', 'weapon5': '0.034', 'ARMOR': '0.080', 'AMMO4': '0.121', 'AMMO3': '0.152', 'WEAPON5': '0.200', 'HITCOUNT': '0.280', 'weapon4': '0.282', 'WEAPON4': '0.300', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.810', 'weapon2': '1.576', 'weapon3': '1.686', 'FRAGCOUNT': '2.000'} [2024-08-05 07:07:47,591][00139] DAMAGECOUNT value on done: 53959.0 [2024-08-05 07:07:47,591][00139] Sum rewards: 1.988, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.810', 'AMMO2': '0.009', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'weapon5': '0.036', 'ARMOR': '0.045', 'AMMO4': '0.047', 'AMMO3': '0.068', 'WEAPON5': '0.150', 'HITCOUNT': '0.330', 'WEAPON3': '0.400', 'DAMAGECOUNT': '1.215', 'weapon3': '1.460', 'weapon2': '1.518', 'FRAGCOUNT': '3.500'} [2024-08-05 07:07:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5791744. Throughput: 0: 286.2. Samples: 1449418. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:50,484][00034] Avg episode reward: [(0, '-3.773')] [2024-08-05 07:07:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5799936. Throughput: 0: 285.4. Samples: 1451135. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:07:55,485][00034] Avg episode reward: [(0, '-3.773')] [2024-08-05 07:08:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5808128. 
Throughput: 0: 283.5. Samples: 1451961. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:00,484][00034] Avg episode reward: [(0, '-3.773')] [2024-08-05 07:08:02,223][00139] DAMAGECOUNT value on done: 51242.0 [2024-08-05 07:08:02,223][00139] Sum rewards: -2.768, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.227', 'AMMO5': '0.003', 'weapon5': '0.012', 'AMMO2': '0.019', 'weapon7': '0.048', 'WEAPON5': '0.050', 'ARMOR': '0.092', 'AMMO4': '0.094', 'AMMO3': '0.141', 'HITCOUNT': '0.260', 'WEAPON4': '0.300', 'AMMO6': '0.360', 'AMMO7': '0.360', 'WEAPON7': '0.400', 'weapon4': '0.482', 'DAMAGECOUNT': '0.780', 'WEAPON3': '0.950', 'weapon2': '1.212', 'weapon3': '1.646', 'FRAGCOUNT': '2.000'} [2024-08-05 07:08:02,475][00139] DAMAGECOUNT value on done: 54366.0 [2024-08-05 07:08:02,476][00139] Sum rewards: -3.412, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.859', 'AMMO2': '0.006', 'AMMO5': '0.014', 'AMMO4': '0.028', 'weapon5': '0.044', 'AMMO3': '0.188', 'WEAPON4': '0.200', 'weapon4': '0.220', 'WEAPON5': '0.250', 'HITCOUNT': '0.310', 'ARMOR': '0.885', 'weapon2': '1.128', 'DAMAGECOUNT': '1.221', 'WEAPON3': '1.250', 'FRAGCOUNT': '2.000', 'weapon3': '2.202'} [2024-08-05 07:08:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5808128. Throughput: 0: 284.8. Samples: 1453691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:05,484][00034] Avg episode reward: [(0, '-3.732')] [2024-08-05 07:08:05,578][00138] Updated weights for policy 0, policy_version 710 (0.0017) [2024-08-05 07:08:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5816320. Throughput: 0: 284.4. Samples: 1455371. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:10,485][00034] Avg episode reward: [(0, '-3.732')] [2024-08-05 07:08:15,469][00139] Large shaping reward 2.652 for [('FRAGCOUNT', 2.0, 2.0), ('HITCOUNT', 0.05, 5.0), ('DAMAGECOUNT', 0.6, 200), ('weapon2', 0.002)] [2024-08-05 07:08:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5824512. Throughput: 0: 284.7. Samples: 1456234. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:15,484][00034] Avg episode reward: [(0, '-3.732')] [2024-08-05 07:08:17,055][00139] DAMAGECOUNT value on done: 51743.0 [2024-08-05 07:08:17,055][00139] Sum rewards: -4.189, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.394', 'AMMO2': '0.002', 'AMMO5': '0.007', 'AMMO4': '0.010', 'weapon5': '0.042', 'weapon7': '0.086', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.160', 'AMMO3': '0.198', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.095', 'weapon2': '1.350', 'FRAGCOUNT': '1.500', 'weapon3': '1.954'} [2024-08-05 07:08:17,276][00139] DAMAGECOUNT value on done: 54758.0 [2024-08-05 07:08:17,276][00139] Sum rewards: -1.372, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-2.225', 'AMMO4': '-0.011', 'AMMO2': '-0.002', 'AMMO5': '0.013', 'weapon5': '0.040', 'WEAPON5': '0.100', 'AMMO3': '0.129', 'weapon7': '0.174', 'WEAPON4': '0.200', 'AMMO6': '0.220', 'AMMO7': '0.220', 'HITCOUNT': '0.270', 'WEAPON7': '0.300', 'weapon4': '0.446', 'weapon2': '0.616', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.176', 'weapon3': '1.862'} [2024-08-05 07:08:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5832704. Throughput: 0: 284.4. Samples: 1457932. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:20,484][00034] Avg episode reward: [(0, '-3.679')] [2024-08-05 07:08:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5832704. 
Throughput: 0: 285.8. Samples: 1459700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:25,485][00034] Avg episode reward: [(0, '-3.679')] [2024-08-05 07:08:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5840896. Throughput: 0: 284.9. Samples: 1460554. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:30,484][00034] Avg episode reward: [(0, '-3.679')] [2024-08-05 07:08:31,670][00139] DAMAGECOUNT value on done: 52152.0 [2024-08-05 07:08:31,670][00139] Sum rewards: 1.838, reward structure: {'DEATHCOUNT': '-7.500', 'AMMO2': '0.008', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'AMMO4': '0.039', 'weapon5': '0.092', 'AMMO3': '0.133', 'WEAPON5': '0.200', 'HEALTH': '0.266', 'HITCOUNT': '0.280', 'WEAPON3': '0.700', 'weapon2': '1.166', 'DAMAGECOUNT': '1.227', 'weapon3': '2.204', 'FRAGCOUNT': '3.000'} [2024-08-05 07:08:31,900][00139] DAMAGECOUNT value on done: 55113.0 [2024-08-05 07:08:31,900][00139] Sum rewards: -3.964, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.522', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'weapon5': '0.014', 'AMMO2': '0.015', 'AMMO4': '0.074', 'ARMOR': '0.080', 'WEAPON4': '0.100', 'AMMO3': '0.142', 'WEAPON5': '0.200', 'weapon4': '0.244', 'HITCOUNT': '0.260', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.065', 'weapon2': '1.198', 'FRAGCOUNT': '1.500', 'weapon3': '2.044'} [2024-08-05 07:08:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5849088. Throughput: 0: 285.5. Samples: 1462264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:35,485][00034] Avg episode reward: [(0, '-3.645')] [2024-08-05 07:08:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5849088. Throughput: 0: 285.7. Samples: 1463991. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:40,484][00034] Avg episode reward: [(0, '-3.645')] [2024-08-05 07:08:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5857280. Throughput: 0: 286.7. Samples: 1464861. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:45,484][00034] Avg episode reward: [(0, '-3.645')] [2024-08-05 07:08:46,047][00139] DAMAGECOUNT value on done: 52441.0 [2024-08-05 07:08:46,047][00139] Sum rewards: -3.428, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.280', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'AMMO2': '0.022', 'weapon5': '0.046', 'AMMO3': '0.085', 'AMMO4': '0.111', 'WEAPON5': '0.200', 'HITCOUNT': '0.240', 'WEAPON4': '0.300', 'weapon4': '0.354', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.867', 'FRAGCOUNT': '1.000', 'weapon3': '1.306', 'weapon2': '1.540'} [2024-08-05 07:08:46,278][00139] DAMAGECOUNT value on done: 55441.0 [2024-08-05 07:08:46,279][00139] Sum rewards: -1.798, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.451', 'AMMO5': '0.004', 'ARMOR': '0.008', 'AMMO2': '0.023', 'weapon4': '0.032', 'weapon5': '0.038', 'WEAPON1': '0.050', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO4': '0.115', 'AMMO3': '0.131', 'HITCOUNT': '0.210', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.984', 'weapon2': '1.030', 'weapon3': '2.028', 'FRAGCOUNT': '3.000'} [2024-08-05 07:08:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5865472. Throughput: 0: 286.6. Samples: 1466586. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:50,485][00034] Avg episode reward: [(0, '-3.520')] [2024-08-05 07:08:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5873664. Throughput: 0: 289.0. Samples: 1468374. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:08:55,484][00034] Avg episode reward: [(0, '-3.520')] [2024-08-05 07:09:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5873664. Throughput: 0: 289.2. Samples: 1469246. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:00,484][00034] Avg episode reward: [(0, '-3.520')] [2024-08-05 07:09:00,671][00139] DAMAGECOUNT value on done: 52833.0 [2024-08-05 07:09:00,671][00139] Sum rewards: 1.528, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.956', 'AMMO5': '0.003', 'AMMO2': '0.005', 'weapon5': '0.006', 'ARMOR': '0.020', 'AMMO4': '0.024', 'WEAPON5': '0.050', 'weapon7': '0.062', 'AMMO3': '0.088', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.168', 'WEAPON7': '0.200', 'HITCOUNT': '0.260', 'WEAPON3': '0.500', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.176', 'weapon2': '1.266', 'weapon3': '1.816'} [2024-08-05 07:09:00,908][00139] DAMAGECOUNT value on done: 55666.0 [2024-08-05 07:09:00,909][00139] Sum rewards: -4.317, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.568', 'AMMO2': '0.020', 'AMMO5': '0.028', 'weapon4': '0.054', 'weapon5': '0.078', 'AMMO4': '0.097', 'WEAPON4': '0.100', 'AMMO3': '0.119', 'HITCOUNT': '0.200', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.675', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.536', 'weapon3': '1.844'} [2024-08-05 07:09:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5881856. Throughput: 0: 288.6. Samples: 1470920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:05,484][00034] Avg episode reward: [(0, '-3.463')] [2024-08-05 07:09:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5890048. Throughput: 0: 287.4. Samples: 1472635. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:10,484][00034] Avg episode reward: [(0, '-3.463')] [2024-08-05 07:09:15,176][00139] DAMAGECOUNT value on done: 53118.0 [2024-08-05 07:09:15,176][00139] Sum rewards: -1.689, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.282', 'FRAGCOUNT': '-1.000', 'AMMO2': '0.007', 'AMMO5': '0.009', 'WEAPON1': '0.020', 'AMMO4': '0.034', 'AMMO3': '0.064', 'weapon7': '0.072', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon5': '0.130', 'HITCOUNT': '0.200', 'WEAPON5': '0.250', 'WEAPON4': '0.300', 'WEAPON3': '0.550', 'ARMOR': '0.590', 'weapon4': '0.758', 'DAMAGECOUNT': '0.855', 'weapon2': '1.162', 'weapon3': '1.292'} [2024-08-05 07:09:15,405][00139] DAMAGECOUNT value on done: 55924.0 [2024-08-05 07:09:15,405][00139] Sum rewards: -1.258, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.932', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.021', 'ARMOR': '0.048', 'WEAPON5': '0.100', 'weapon5': '0.102', 'AMMO4': '0.105', 'AMMO3': '0.118', 'HITCOUNT': '0.180', 'WEAPON4': '0.300', 'weapon4': '0.332', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.774', 'weapon2': '1.356', 'weapon3': '1.720', 'FRAGCOUNT': '2.000'} [2024-08-05 07:09:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5890048. Throughput: 0: 288.4. Samples: 1473532. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:15,484][00034] Avg episode reward: [(0, '-3.404')] [2024-08-05 07:09:16,813][00138] Updated weights for policy 0, policy_version 720 (0.0017) [2024-08-05 07:09:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5898240. Throughput: 0: 288.1. Samples: 1475230. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:20,484][00034] Avg episode reward: [(0, '-3.404')] [2024-08-05 07:09:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5906432. Throughput: 0: 288.3. Samples: 1476966. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:25,484][00034] Avg episode reward: [(0, '-3.404')] [2024-08-05 07:09:29,689][00139] DAMAGECOUNT value on done: 53915.0 [2024-08-05 07:09:29,690][00139] Sum rewards: -1.175, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.664', 'AMMO5': '0.010', 'AMMO2': '0.019', 'weapon5': '0.062', 'AMMO4': '0.096', 'WEAPON5': '0.150', 'AMMO3': '0.191', 'WEAPON4': '0.350', 'HITCOUNT': '0.470', 'weapon4': '0.508', 'WEAPON3': '1.100', 'weapon2': '1.372', 'weapon3': '1.520', 'DAMAGECOUNT': '2.391', 'FRAGCOUNT': '4.500'} [2024-08-05 07:09:29,917][00139] DAMAGECOUNT value on done: 56214.0 [2024-08-05 07:09:29,918][00139] Sum rewards: 3.792, reward structure: {'DEATHCOUNT': '-3.750', 'HEALTH': '-0.632', 'weapon5': '0.002', 'AMMO5': '0.005', 'AMMO2': '0.009', 'AMMO4': '0.047', 'AMMO3': '0.064', 'WEAPON5': '0.100', 'HITCOUNT': '0.250', 'WEAPON4': '0.250', 'weapon4': '0.328', 'WEAPON3': '0.400', 'ARMOR': '0.503', 'weapon2': '0.688', 'DAMAGECOUNT': '0.870', 'weapon3': '1.658', 'FRAGCOUNT': '3.000'} [2024-08-05 07:09:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5906432. Throughput: 0: 289.3. Samples: 1477880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:30,484][00034] Avg episode reward: [(0, '-3.300')] [2024-08-05 07:09:30,486][00132] Saving new best policy, reward=-3.300! [2024-08-05 07:09:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5914624. Throughput: 0: 288.3. Samples: 1479560. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:35,484][00034] Avg episode reward: [(0, '-3.300')] [2024-08-05 07:09:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000722_5914624.pth... 
[2024-08-05 07:09:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000689_5644288.pth [2024-08-05 07:09:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5922816. Throughput: 0: 286.4. Samples: 1481263. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:40,484][00034] Avg episode reward: [(0, '-3.300')] [2024-08-05 07:09:44,341][00139] DAMAGECOUNT value on done: 54366.0 [2024-08-05 07:09:44,342][00139] Sum rewards: -0.849, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.940', 'AMMO5': '0.015', 'AMMO2': '0.020', 'weapon4': '0.076', 'AMMO4': '0.097', 'WEAPON4': '0.100', 'weapon5': '0.122', 'AMMO3': '0.126', 'HITCOUNT': '0.220', 'WEAPON5': '0.300', 'ARMOR': '0.476', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.353', 'weapon2': '1.620', 'weapon3': '1.766', 'FRAGCOUNT': '2.000'} [2024-08-05 07:09:44,565][00139] DAMAGECOUNT value on done: 56534.0 [2024-08-05 07:09:44,565][00139] Sum rewards: -3.558, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.048', 'AMMO5': '0.012', 'AMMO2': '0.032', 'weapon5': '0.056', 'ARMOR': '0.064', 'weapon4': '0.114', 'AMMO3': '0.140', 'AMMO4': '0.161', 'WEAPON5': '0.250', 'HITCOUNT': '0.280', 'WEAPON4': '0.300', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.960', 'weapon2': '1.458', 'FRAGCOUNT': '1.500', 'weapon3': '1.762'} [2024-08-05 07:09:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5931008. Throughput: 0: 287.5. Samples: 1482183. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:45,485][00034] Avg episode reward: [(0, '-3.353')] [2024-08-05 07:09:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5931008. Throughput: 0: 288.9. Samples: 1483920. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:50,484][00034] Avg episode reward: [(0, '-3.353')] [2024-08-05 07:09:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5939200. Throughput: 0: 289.1. Samples: 1485644. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:09:55,484][00034] Avg episode reward: [(0, '-3.353')] [2024-08-05 07:09:58,715][00139] Large shaping reward -2.504 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.255, -85.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 07:09:58,994][00139] DAMAGECOUNT value on done: 54650.0 [2024-08-05 07:09:58,995][00139] Sum rewards: -8.335, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.314', 'FRAGCOUNT': '-2.000', 'AMMO2': '0.002', 'AMMO4': '0.011', 'ARMOR': '0.032', 'AMMO5': '0.038', 'AMMO3': '0.148', 'WEAPON4': '0.150', 'weapon4': '0.150', 'HITCOUNT': '0.230', 'weapon5': '0.236', 'WEAPON5': '0.550', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.852', 'weapon2': '1.272', 'weapon3': '2.008'} [2024-08-05 07:09:59,212][00139] DAMAGECOUNT value on done: 56739.0 [2024-08-05 07:09:59,212][00139] Sum rewards: -2.838, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.848', 'AMMO2': '0.006', 'AMMO5': '0.007', 'AMMO4': '0.031', 'weapon4': '0.038', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'AMMO3': '0.149', 'HITCOUNT': '0.190', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'ARMOR': '1.349', 'weapon3': '1.708', 'weapon2': '1.716'} [2024-08-05 07:10:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1166.3). Total num frames: 5947392. Throughput: 0: 288.2. Samples: 1486502. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:00,484][00034] Avg episode reward: [(0, '-3.410')] [2024-08-05 07:10:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5947392. Throughput: 0: 287.3. Samples: 1488160. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:05,485][00034] Avg episode reward: [(0, '-3.410')] [2024-08-05 07:10:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5955584. Throughput: 0: 286.5. Samples: 1489858. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:10,485][00034] Avg episode reward: [(0, '-3.410')] [2024-08-05 07:10:13,807][00139] DAMAGECOUNT value on done: 54798.0 [2024-08-05 07:10:13,808][00139] Sum rewards: -2.327, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.328', 'AMMO2': '0.011', 'AMMO5': '0.012', 'weapon5': '0.054', 'AMMO4': '0.055', 'ARMOR': '0.068', 'WEAPON4': '0.100', 'AMMO3': '0.128', 'HITCOUNT': '0.150', 'weapon4': '0.286', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.444', 'WEAPON3': '0.850', 'weapon2': '1.390', 'weapon3': '1.902', 'FRAGCOUNT': '3.000'} [2024-08-05 07:10:14,105][00139] DAMAGECOUNT value on done: 56989.0 [2024-08-05 07:10:14,106][00139] Sum rewards: -5.333, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.676', 'weapon5': '0.006', 'AMMO5': '0.007', 'AMMO2': '0.029', 'WEAPON5': '0.100', 'AMMO3': '0.124', 'AMMO4': '0.146', 'WEAPON4': '0.150', 'HITCOUNT': '0.210', 'weapon4': '0.230', 'ARMOR': '0.400', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.744', 'weapon2': '1.846'} [2024-08-05 07:10:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5963776. Throughput: 0: 285.7. Samples: 1490738. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:15,485][00034] Avg episode reward: [(0, '-3.437')] [2024-08-05 07:10:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5963776. Throughput: 0: 286.1. Samples: 1492436. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:20,485][00034] Avg episode reward: [(0, '-3.437')] [2024-08-05 07:10:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 5971968. Throughput: 0: 285.6. Samples: 1494117. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:25,485][00034] Avg episode reward: [(0, '-3.437')] [2024-08-05 07:10:28,335][00138] Updated weights for policy 0, policy_version 730 (0.0017) [2024-08-05 07:10:28,652][00139] DAMAGECOUNT value on done: 55313.0 [2024-08-05 07:10:28,652][00139] Sum rewards: -0.536, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.804', 'AMMO4': '-0.019', 'AMMO2': '-0.004', 'AMMO5': '0.009', 'WEAPON5': '0.150', 'AMMO3': '0.156', 'weapon5': '0.200', 'HITCOUNT': '0.330', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.545', 'weapon3': '1.612', 'weapon2': '1.788', 'FRAGCOUNT': '3.500'} [2024-08-05 07:10:28,912][00139] DAMAGECOUNT value on done: 57159.0 [2024-08-05 07:10:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 5980160. Throughput: 0: 285.0. Samples: 1495006. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:30,485][00034] Avg episode reward: [(0, '-3.337')] [2024-08-05 07:10:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5980160. Throughput: 0: 284.1. Samples: 1496704. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:35,485][00034] Avg episode reward: [(0, '-3.337')] [2024-08-05 07:10:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5988352. Throughput: 0: 281.8. Samples: 1498327. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:40,484][00034] Avg episode reward: [(0, '-3.337')] [2024-08-05 07:10:43,737][00139] DAMAGECOUNT value on done: 55618.0 [2024-08-05 07:10:43,738][00139] Sum rewards: -3.502, reward structure: {'DEATHCOUNT': '-12.000', 'AMMO5': '0.015', 'ARMOR': '0.016', 'AMMO2': '0.027', 'weapon5': '0.058', 'weapon4': '0.094', 'HEALTH': '0.108', 'AMMO4': '0.137', 'WEAPON4': '0.150', 'AMMO3': '0.152', 'HITCOUNT': '0.220', 'WEAPON5': '0.250', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.915', 'weapon2': '1.690', 'weapon3': '1.966', 'FRAGCOUNT': '2.000'} [2024-08-05 07:10:43,982][00139] DAMAGECOUNT value on done: 57361.0 [2024-08-05 07:10:43,982][00139] Sum rewards: -1.427, reward structure: {'DEATHCOUNT': '-8.250', 'AMMO5': '0.010', 'AMMO2': '0.016', 'weapon5': '0.018', 'WEAPON1': '0.020', 'weapon4': '0.042', 'ARMOR': '0.044', 'WEAPON4': '0.050', 'AMMO4': '0.080', 'HEALTH': '0.093', 'AMMO3': '0.099', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.606', 'FRAGCOUNT': '1.000', 'weapon2': '1.742', 'weapon3': '1.992'} [2024-08-05 07:10:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5996544. Throughput: 0: 281.4. Samples: 1499166. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:45,484][00034] Avg episode reward: [(0, '-3.283')] [2024-08-05 07:10:45,491][00132] Saving new best policy, reward=-3.283! [2024-08-05 07:10:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 5996544. Throughput: 0: 280.9. Samples: 1500802. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:50,484][00034] Avg episode reward: [(0, '-3.283')] [2024-08-05 07:10:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6004736. Throughput: 0: 280.9. Samples: 1502500. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:10:55,485][00034] Avg episode reward: [(0, '-3.283')] [2024-08-05 07:10:58,751][00139] DAMAGECOUNT value on done: 55738.0 [2024-08-05 07:10:58,752][00139] Sum rewards: -2.483, reward structure: {'DEATHCOUNT': '-9.000', 'AMMO5': '0.005', 'AMMO2': '0.015', 'WEAPON5': '0.050', 'HEALTH': '0.050', 'AMMO4': '0.075', 'weapon5': '0.076', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'ARMOR': '0.108', 'AMMO3': '0.131', 'weapon4': '0.146', 'DAMAGECOUNT': '0.360', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.394', 'weapon3': '2.216'} [2024-08-05 07:10:59,015][00139] DAMAGECOUNT value on done: 57577.0 [2024-08-05 07:10:59,016][00139] Sum rewards: -4.381, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.970', 'AMMO5': '0.003', 'AMMO2': '0.026', 'weapon5': '0.026', 'WEAPON5': '0.050', 'AMMO4': '0.128', 'AMMO3': '0.130', 'WEAPON4': '0.150', 'HITCOUNT': '0.190', 'weapon4': '0.286', 'ARMOR': '0.456', 'DAMAGECOUNT': '0.648', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.184', 'weapon3': '2.162'} [2024-08-05 07:11:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6012928. Throughput: 0: 280.2. Samples: 1503347. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:00,484][00034] Avg episode reward: [(0, '-3.294')] [2024-08-05 07:11:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6021120. Throughput: 0: 280.6. Samples: 1505064. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:05,484][00034] Avg episode reward: [(0, '-3.294')] [2024-08-05 07:11:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6021120. Throughput: 0: 280.0. Samples: 1506717. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:10,486][00034] Avg episode reward: [(0, '-3.294')] [2024-08-05 07:11:13,775][00139] DAMAGECOUNT value on done: 56083.0 [2024-08-05 07:11:13,776][00139] Sum rewards: -1.330, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.320', 'AMMO4': '-0.048', 'AMMO2': '-0.009', 'AMMO5': '0.012', 'ARMOR': '0.028', 'weapon5': '0.030', 'AMMO3': '0.098', 'WEAPON4': '0.100', 'weapon4': '0.234', 'WEAPON5': '0.250', 'HITCOUNT': '0.310', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.035', 'weapon2': '1.120', 'weapon3': '1.580', 'FRAGCOUNT': '2.000'} [2024-08-05 07:11:14,005][00139] DAMAGECOUNT value on done: 58482.0 [2024-08-05 07:11:14,005][00139] Sum rewards: 3.494, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.122', 'AMMO4': '-0.051', 'AMMO2': '-0.010', 'AMMO5': '0.007', 'weapon5': '0.012', 'ARMOR': '0.082', 'AMMO3': '0.086', 'WEAPON5': '0.150', 'HITCOUNT': '0.600', 'WEAPON3': '0.650', 'weapon3': '1.672', 'weapon2': '1.952', 'DAMAGECOUNT': '2.715', 'FRAGCOUNT': '6.000'} [2024-08-05 07:11:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6029312. Throughput: 0: 278.9. Samples: 1507556. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:15,485][00034] Avg episode reward: [(0, '-3.220')] [2024-08-05 07:11:15,494][00132] Saving new best policy, reward=-3.220! [2024-08-05 07:11:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6037504. Throughput: 0: 278.0. Samples: 1509215. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:20,484][00034] Avg episode reward: [(0, '-3.220')] [2024-08-05 07:11:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6037504. Throughput: 0: 278.3. Samples: 1510852. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:25,485][00034] Avg episode reward: [(0, '-3.220')] [2024-08-05 07:11:29,080][00139] DAMAGECOUNT value on done: 56352.0 [2024-08-05 07:11:29,080][00139] Sum rewards: -1.013, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.040', 'AMMO2': '0.008', 'AMMO5': '0.019', 'WEAPON1': '0.020', 'AMMO4': '0.041', 'weapon5': '0.102', 'AMMO3': '0.139', 'HITCOUNT': '0.220', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.807', 'WEAPON3': '0.850', 'weapon3': '1.546', 'weapon2': '1.974', 'FRAGCOUNT': '3.000'} [2024-08-05 07:11:29,311][00139] DAMAGECOUNT value on done: 58877.0 [2024-08-05 07:11:29,312][00139] Sum rewards: -2.199, reward structure: {'DEATHCOUNT': '-9.000', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'AMMO2': '0.018', 'weapon5': '0.028', 'WEAPON4': '0.050', 'AMMO3': '0.082', 'AMMO4': '0.089', 'WEAPON5': '0.100', 'ARMOR': '0.112', 'HEALTH': '0.160', 'weapon4': '0.208', 'HITCOUNT': '0.390', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.185', 'weapon2': '1.658', 'weapon3': '1.696'} [2024-08-05 07:11:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6045696. Throughput: 0: 278.1. Samples: 1511681. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:30,484][00034] Avg episode reward: [(0, '-3.186')] [2024-08-05 07:11:30,487][00132] Saving new best policy, reward=-3.186! [2024-08-05 07:11:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6053888. Throughput: 0: 279.6. Samples: 1513384. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:35,485][00034] Avg episode reward: [(0, '-3.186')] [2024-08-05 07:11:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000739_6053888.pth... 
[2024-08-05 07:11:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000705_5775360.pth [2024-08-05 07:11:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6053888. Throughput: 0: 277.7. Samples: 1514996. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:40,484][00034] Avg episode reward: [(0, '-3.186')] [2024-08-05 07:11:42,034][00138] Updated weights for policy 0, policy_version 740 (0.0018) [2024-08-05 07:11:44,051][00139] DAMAGECOUNT value on done: 56755.0 [2024-08-05 07:11:44,052][00139] Sum rewards: -2.383, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.712', 'AMMO4': '-0.021', 'AMMO2': '-0.004', 'AMMO5': '0.041', 'weapon4': '0.042', 'ARMOR': '0.056', 'WEAPON4': '0.100', 'AMMO3': '0.120', 'weapon5': '0.186', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'HITCOUNT': '0.280', 'WEAPON5': '0.650', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.209', 'weapon2': '1.700', 'weapon3': '1.820', 'FRAGCOUNT': '3.000'} [2024-08-05 07:11:44,269][00139] DAMAGECOUNT value on done: 59517.0 [2024-08-05 07:11:44,269][00139] Sum rewards: -1.554, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.446', 'AMMO4': '-0.042', 'AMMO2': '-0.008', 'AMMO5': '0.014', 'weapon5': '0.132', 'AMMO3': '0.212', 'WEAPON5': '0.250', 'HITCOUNT': '0.330', 'ARMOR': '0.428', 'WEAPON3': '1.250', 'weapon2': '1.490', 'DAMAGECOUNT': '1.920', 'weapon3': '2.166', 'FRAGCOUNT': '4.000'} [2024-08-05 07:11:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6062080. Throughput: 0: 277.9. Samples: 1515854. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:45,485][00034] Avg episode reward: [(0, '-3.268')] [2024-08-05 07:11:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6070272. Throughput: 0: 277.6. Samples: 1517558. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:50,485][00034] Avg episode reward: [(0, '-3.268')] [2024-08-05 07:11:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6070272. Throughput: 0: 278.2. Samples: 1519234. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:11:55,484][00034] Avg episode reward: [(0, '-3.268')] [2024-08-05 07:11:59,051][00139] DAMAGECOUNT value on done: 57242.0 [2024-08-05 07:11:59,052][00139] Sum rewards: 2.060, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.570', 'AMMO2': '0.014', 'AMMO5': '0.027', 'ARMOR': '0.040', 'weapon7': '0.048', 'AMMO4': '0.071', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.105', 'weapon4': '0.118', 'weapon5': '0.202', 'HITCOUNT': '0.240', 'WEAPON5': '0.450', 'WEAPON3': '0.850', 'weapon2': '1.336', 'DAMAGECOUNT': '1.461', 'weapon3': '1.768', 'FRAGCOUNT': '4.000'} [2024-08-05 07:11:59,291][00139] DAMAGECOUNT value on done: 59682.0 [2024-08-05 07:11:59,292][00139] Sum rewards: -3.599, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.912', 'AMMO2': '0.003', 'AMMO4': '0.016', 'ARMOR': '0.052', 'AMMO3': '0.131', 'HITCOUNT': '0.180', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.730', 'weapon3': '1.956'} [2024-08-05 07:12:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6078464. Throughput: 0: 278.1. Samples: 1520072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:00,485][00034] Avg episode reward: [(0, '-3.236')] [2024-08-05 07:12:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6086656. Throughput: 0: 279.5. Samples: 1521792. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:05,484][00034] Avg episode reward: [(0, '-3.236')] [2024-08-05 07:12:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6086656. 
Throughput: 0: 279.5. Samples: 1523429. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:10,485][00034] Avg episode reward: [(0, '-3.236')] [2024-08-05 07:12:14,094][00139] DAMAGECOUNT value on done: 57686.0 [2024-08-05 07:12:14,095][00139] Sum rewards: 4.528, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.628', 'AMMO4': '-0.023', 'AMMO2': '-0.005', 'AMMO5': '0.007', 'weapon7': '0.052', 'weapon5': '0.054', 'WEAPON5': '0.100', 'AMMO3': '0.131', 'HITCOUNT': '0.180', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'ARMOR': '0.551', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.332', 'weapon2': '1.682', 'weapon3': '1.954', 'FRAGCOUNT': '5.000'} [2024-08-05 07:12:14,330][00139] DAMAGECOUNT value on done: 60167.0 [2024-08-05 07:12:14,330][00139] Sum rewards: -0.238, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.738', 'AMMO2': '0.001', 'AMMO4': '0.005', 'AMMO5': '0.015', 'ARMOR': '0.036', 'AMMO3': '0.090', 'HITCOUNT': '0.150', 'weapon5': '0.240', 'WEAPON5': '0.450', 'WEAPON3': '0.650', 'weapon3': '1.440', 'DAMAGECOUNT': '1.455', 'weapon2': '1.968', 'FRAGCOUNT': '2.500'} [2024-08-05 07:12:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6094848. Throughput: 0: 279.6. Samples: 1524262. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:15,484][00034] Avg episode reward: [(0, '-3.204')] [2024-08-05 07:12:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6103040. Throughput: 0: 279.7. Samples: 1525970. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:20,484][00034] Avg episode reward: [(0, '-3.204')] [2024-08-05 07:12:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6103040. Throughput: 0: 282.2. Samples: 1527695. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:25,485][00034] Avg episode reward: [(0, '-3.204')] [2024-08-05 07:12:28,851][00139] DAMAGECOUNT value on done: 58182.0 [2024-08-05 07:12:28,852][00139] Sum rewards: 3.488, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.560', 'AMMO5': '0.005', 'AMMO2': '0.007', 'weapon5': '0.008', 'AMMO4': '0.035', 'WEAPON5': '0.050', 'weapon7': '0.066', 'AMMO3': '0.090', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'weapon4': '0.216', 'HITCOUNT': '0.300', 'WEAPON3': '0.550', 'ARMOR': '0.927', 'DAMAGECOUNT': '1.488', 'weapon3': '1.508', 'weapon2': '1.758', 'FRAGCOUNT': '4.000'} [2024-08-05 07:12:29,096][00139] DAMAGECOUNT value on done: 60991.0 [2024-08-05 07:12:29,097][00139] Sum rewards: -5.391, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-3.390', 'AMMO4': '-0.081', 'AMMO2': '-0.016', 'AMMO5': '0.015', 'weapon5': '0.156', 'AMMO3': '0.179', 'WEAPON5': '0.250', 'ARMOR': '0.469', 'FRAGCOUNT': '0.500', 'HITCOUNT': '0.510', 'WEAPON3': '1.200', 'weapon3': '1.634', 'weapon2': '1.960', 'DAMAGECOUNT': '2.472'} [2024-08-05 07:12:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6111232. Throughput: 0: 281.9. Samples: 1528541. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:30,484][00034] Avg episode reward: [(0, '-3.152')] [2024-08-05 07:12:30,486][00132] Saving new best policy, reward=-3.152! [2024-08-05 07:12:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6119424. Throughput: 0: 282.1. Samples: 1530253. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:35,484][00034] Avg episode reward: [(0, '-3.152')] [2024-08-05 07:12:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6127616. Throughput: 0: 282.2. Samples: 1531931. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:40,484][00034] Avg episode reward: [(0, '-3.152')] [2024-08-05 07:12:43,890][00139] DAMAGECOUNT value on done: 58435.0 [2024-08-05 07:12:44,139][00139] DAMAGECOUNT value on done: 61220.0 [2024-08-05 07:12:44,139][00139] Sum rewards: -1.063, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.587', 'AMMO5': '0.005', 'AMMO2': '0.018', 'WEAPON4': '0.050', 'weapon5': '0.080', 'AMMO4': '0.090', 'AMMO3': '0.093', 'weapon4': '0.100', 'WEAPON5': '0.150', 'HITCOUNT': '0.210', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.687', 'weapon3': '1.124', 'weapon2': '1.666', 'FRAGCOUNT': '3.000'} [2024-08-05 07:12:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6127616. Throughput: 0: 280.9. Samples: 1532712. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:45,484][00034] Avg episode reward: [(0, '-3.157')] [2024-08-05 07:12:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6135808. Throughput: 0: 281.6. Samples: 1534465. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:50,484][00034] Avg episode reward: [(0, '-3.157')] [2024-08-05 07:12:54,452][00138] Updated weights for policy 0, policy_version 750 (0.0017) [2024-08-05 07:12:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6144000. Throughput: 0: 283.8. Samples: 1536202. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:12:55,485][00034] Avg episode reward: [(0, '-3.157')] [2024-08-05 07:12:58,432][00139] DAMAGECOUNT value on done: 58920.0 [2024-08-05 07:12:58,432][00139] Sum rewards: -0.263, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.204', 'weapon5': '0.012', 'AMMO5': '0.013', 'AMMO2': '0.019', 'WEAPON5': '0.050', 'weapon4': '0.074', 'ARMOR': '0.086', 'AMMO4': '0.095', 'WEAPON4': '0.150', 'AMMO3': '0.168', 'HITCOUNT': '0.330', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.455', 'weapon2': '1.588', 'weapon3': '2.102', 'FRAGCOUNT': '5.000'} [2024-08-05 07:12:58,647][00139] DAMAGECOUNT value on done: 61484.0 [2024-08-05 07:13:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6144000. Throughput: 0: 284.4. Samples: 1537059. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:00,485][00034] Avg episode reward: [(0, '-3.132')] [2024-08-05 07:13:00,486][00132] Saving new best policy, reward=-3.132! [2024-08-05 07:13:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6152192. Throughput: 0: 283.9. Samples: 1538747. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:05,485][00034] Avg episode reward: [(0, '-3.132')] [2024-08-05 07:13:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6160384. Throughput: 0: 283.0. Samples: 1540430. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:10,485][00034] Avg episode reward: [(0, '-3.132')] [2024-08-05 07:13:13,470][00139] DAMAGECOUNT value on done: 59184.0 [2024-08-05 07:13:13,471][00139] Sum rewards: -3.363, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.068', 'AMMO2': '0.002', 'AMMO4': '0.009', 'AMMO5': '0.014', 'ARMOR': '0.056', 'weapon5': '0.070', 'HITCOUNT': '0.100', 'AMMO3': '0.139', 'WEAPON5': '0.200', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.792', 'FRAGCOUNT': '1.000', 'weapon2': '1.662', 'weapon3': '1.910'} [2024-08-05 07:13:13,709][00139] DAMAGECOUNT value on done: 61995.0 [2024-08-05 07:13:13,710][00139] Sum rewards: 1.530, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.366', 'AMMO2': '0.015', 'AMMO5': '0.018', 'ARMOR': '0.044', 'WEAPON4': '0.050', 'weapon7': '0.064', 'AMMO4': '0.073', 'weapon4': '0.080', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.128', 'AMMO3': '0.157', 'WEAPON7': '0.200', 'HITCOUNT': '0.310', 'WEAPON5': '0.400', 'WEAPON3': '0.900', 'weapon2': '1.296', 'DAMAGECOUNT': '1.533', 'weapon3': '2.138', 'FRAGCOUNT': '5.000'} [2024-08-05 07:13:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6160384. Throughput: 0: 282.4. Samples: 1541249. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:15,485][00034] Avg episode reward: [(0, '-3.056')] [2024-08-05 07:13:15,491][00132] Saving new best policy, reward=-3.056! [2024-08-05 07:13:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6168576. Throughput: 0: 282.7. Samples: 1542973. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:20,484][00034] Avg episode reward: [(0, '-3.056')] [2024-08-05 07:13:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6176768. Throughput: 0: 283.0. Samples: 1544667. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:25,484][00034] Avg episode reward: [(0, '-3.056')] [2024-08-05 07:13:28,211][00139] DAMAGECOUNT value on done: 59632.0 [2024-08-05 07:13:28,212][00139] Sum rewards: -6.001, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.368', 'AMMO2': '0.004', 'AMMO4': '0.018', 'AMMO5': '0.020', 'weapon5': '0.092', 'AMMO3': '0.133', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'ARMOR': '0.424', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.344', 'weapon3': '1.546', 'weapon2': '2.176'} [2024-08-05 07:13:28,440][00139] DAMAGECOUNT value on done: 62388.0 [2024-08-05 07:13:28,441][00139] Sum rewards: 0.548, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.246', 'AMMO2': '0.001', 'AMMO5': '0.003', 'AMMO4': '0.007', 'weapon5': '0.044', 'WEAPON5': '0.050', 'AMMO3': '0.087', 'HITCOUNT': '0.200', 'WEAPON3': '0.500', 'ARMOR': '0.906', 'DAMAGECOUNT': '1.179', 'weapon2': '1.274', 'weapon3': '1.292', 'FRAGCOUNT': '2.000'} [2024-08-05 07:13:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6176768. Throughput: 0: 284.5. Samples: 1545515. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:30,486][00034] Avg episode reward: [(0, '-3.040')] [2024-08-05 07:13:30,487][00132] Saving new best policy, reward=-3.040! [2024-08-05 07:13:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6184960. Throughput: 0: 283.6. Samples: 1547225. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:35,484][00034] Avg episode reward: [(0, '-3.040')] [2024-08-05 07:13:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000755_6184960.pth... [2024-08-05 07:13:35,566][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000722_5914624.pth [2024-08-05 07:13:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 6193152. Throughput: 0: 282.2. Samples: 1548900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:40,484][00034] Avg episode reward: [(0, '-3.040')] [2024-08-05 07:13:43,018][00139] DAMAGECOUNT value on done: 59837.0 [2024-08-05 07:13:43,019][00139] Sum rewards: -2.137, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.907', 'AMMO5': '0.015', 'AMMO2': '0.020', 'ARMOR': '0.032', 'AMMO4': '0.099', 'weapon5': '0.110', 'AMMO3': '0.111', 'WEAPON4': '0.150', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'weapon4': '0.304', 'DAMAGECOUNT': '0.615', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon2': '1.010', 'weapon3': '1.644'} [2024-08-05 07:13:43,237][00139] DAMAGECOUNT value on done: 62754.0 [2024-08-05 07:13:43,238][00139] Sum rewards: 2.145, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.114', 'AMMO5': '0.005', 'AMMO2': '0.011', 'weapon5': '0.026', 'WEAPON4': '0.050', 'AMMO4': '0.055', 'ARMOR': '0.060', 'WEAPON5': '0.100', 'AMMO3': '0.129', 'HITCOUNT': '0.260', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.098', 'weapon2': '1.402', 'weapon3': '1.562', 'FRAGCOUNT': '3.000'} [2024-08-05 07:13:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6201344. Throughput: 0: 282.4. Samples: 1549767. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:45,486][00034] Avg episode reward: [(0, '-2.943')] [2024-08-05 07:13:45,493][00132] Saving new best policy, reward=-2.943! [2024-08-05 07:13:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6201344. Throughput: 0: 281.4. Samples: 1551408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:50,484][00034] Avg episode reward: [(0, '-2.943')] [2024-08-05 07:13:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6209536. Throughput: 0: 281.6. Samples: 1553103. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:13:55,485][00034] Avg episode reward: [(0, '-2.943')] [2024-08-05 07:13:57,958][00139] DAMAGECOUNT value on done: 60269.0 [2024-08-05 07:13:57,958][00139] Sum rewards: 3.939, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.690', 'AMMO4': '-0.020', 'AMMO2': '-0.004', 'AMMO5': '0.011', 'weapon7': '0.084', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon5': '0.108', 'AMMO3': '0.119', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'WEAPON3': '0.650', 'weapon2': '1.096', 'DAMAGECOUNT': '1.296', 'weapon3': '2.368', 'FRAGCOUNT': '3.500'} [2024-08-05 07:13:58,183][00139] DAMAGECOUNT value on done: 63004.0 [2024-08-05 07:13:58,184][00139] Sum rewards: 1.901, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.480', 'AMMO2': '0.002', 'AMMO4': '0.008', 'ARMOR': '0.028', 'AMMO5': '0.030', 'weapon5': '0.036', 'AMMO3': '0.113', 'HITCOUNT': '0.270', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.750', 'WEAPON3': '0.750', 'weapon2': '1.656', 'weapon3': '2.038', 'FRAGCOUNT': '4.000'} [2024-08-05 07:14:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6217728. Throughput: 0: 283.0. Samples: 1553985. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:00,484][00034] Avg episode reward: [(0, '-2.805')] [2024-08-05 07:14:00,487][00132] Saving new best policy, reward=-2.805! [2024-08-05 07:14:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6217728. Throughput: 0: 282.2. Samples: 1555670. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:05,485][00034] Avg episode reward: [(0, '-2.805')] [2024-08-05 07:14:07,000][00138] Updated weights for policy 0, policy_version 760 (0.0016) [2024-08-05 07:14:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6225920. Throughput: 0: 282.3. Samples: 1557370. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:10,485][00034] Avg episode reward: [(0, '-2.805')] [2024-08-05 07:14:12,784][00139] DAMAGECOUNT value on done: 60587.0 [2024-08-05 07:14:12,785][00139] Sum rewards: 2.019, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-2.014', 'AMMO4': '-0.003', 'AMMO2': '-0.001', 'AMMO5': '0.010', 'weapon7': '0.072', 'WEAPON5': '0.100', 'AMMO3': '0.110', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'HITCOUNT': '0.210', 'ARMOR': '0.425', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.954', 'weapon2': '1.136', 'weapon3': '1.780', 'FRAGCOUNT': '4.000'} [2024-08-05 07:14:13,003][00139] DAMAGECOUNT value on done: 63189.0 [2024-08-05 07:14:15,483][00034] Fps is (10 sec: 1638.3, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6234112. Throughput: 0: 282.7. Samples: 1558235. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:15,487][00034] Avg episode reward: [(0, '-2.665')] [2024-08-05 07:14:15,498][00132] Saving new best policy, reward=-2.665! [2024-08-05 07:14:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6234112. Throughput: 0: 282.0. Samples: 1559914. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:20,484][00034] Avg episode reward: [(0, '-2.665')] [2024-08-05 07:14:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6242304. Throughput: 0: 284.5. Samples: 1561701. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:25,485][00034] Avg episode reward: [(0, '-2.665')] [2024-08-05 07:14:27,339][00139] DAMAGECOUNT value on done: 60647.0 [2024-08-05 07:14:27,339][00139] Sum rewards: -0.950, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-1.278', 'AMMO2': '0.001', 'AMMO4': '0.004', 'weapon4': '0.040', 'AMMO3': '0.063', 'HITCOUNT': '0.080', 'WEAPON4': '0.100', 'ARMOR': '0.166', 'DAMAGECOUNT': '0.180', 'WEAPON3': '0.500', 'FRAGCOUNT': '1.000', 'weapon2': '1.276', 'weapon3': '1.418'} [2024-08-05 07:14:27,569][00139] DAMAGECOUNT value on done: 63673.0 [2024-08-05 07:14:27,569][00139] Sum rewards: 0.429, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.182', 'weapon5': '0.002', 'AMMO5': '0.010', 'ARMOR': '0.016', 'AMMO2': '0.018', 'weapon4': '0.056', 'AMMO4': '0.088', 'weapon7': '0.092', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.131', 'WEAPON5': '0.200', 'HITCOUNT': '0.430', 'WEAPON3': '0.750', 'weapon2': '1.390', 'DAMAGECOUNT': '1.452', 'weapon3': '1.826', 'FRAGCOUNT': '2.000'} [2024-08-05 07:14:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6250496. Throughput: 0: 284.4. Samples: 1562563. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:30,484][00034] Avg episode reward: [(0, '-2.667')] [2024-08-05 07:14:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6250496. Throughput: 0: 286.0. Samples: 1564277. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:35,485][00034] Avg episode reward: [(0, '-2.667')] [2024-08-05 07:14:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6258688. Throughput: 0: 287.6. Samples: 1566047. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:40,484][00034] Avg episode reward: [(0, '-2.667')] [2024-08-05 07:14:41,837][00139] DAMAGECOUNT value on done: 61018.0 [2024-08-05 07:14:41,838][00139] Sum rewards: 0.179, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.176', 'AMMO5': '0.010', 'AMMO2': '0.013', 'weapon5': '0.050', 'AMMO4': '0.065', 'AMMO3': '0.134', 'WEAPON4': '0.200', 'WEAPON5': '0.250', 'weapon4': '0.252', 'HITCOUNT': '0.290', 'ARMOR': '0.517', 'WEAPON3': '0.800', 'weapon2': '1.014', 'DAMAGECOUNT': '1.113', 'weapon3': '2.396', 'FRAGCOUNT': '2.500'} [2024-08-05 07:14:42,067][00139] DAMAGECOUNT value on done: 63948.0 [2024-08-05 07:14:42,067][00139] Sum rewards: -6.582, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.809', 'weapon5': '0.010', 'AMMO2': '0.022', 'AMMO5': '0.025', 'ARMOR': '0.028', 'weapon7': '0.078', 'AMMO4': '0.112', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.156', 'AMMO3': '0.175', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.270', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.825', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon3': '1.550', 'weapon2': '2.086'} [2024-08-05 07:14:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6266880. Throughput: 0: 286.7. Samples: 1566888. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:45,484][00034] Avg episode reward: [(0, '-2.629')] [2024-08-05 07:14:45,492][00132] Saving new best policy, reward=-2.629! [2024-08-05 07:14:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6275072. Throughput: 0: 286.8. Samples: 1568576. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:50,484][00034] Avg episode reward: [(0, '-2.629')] [2024-08-05 07:14:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6275072. Throughput: 0: 287.6. Samples: 1570314. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:14:55,485][00034] Avg episode reward: [(0, '-2.629')] [2024-08-05 07:14:56,560][00139] DAMAGECOUNT value on done: 61183.0 [2024-08-05 07:14:56,561][00139] Sum rewards: -1.786, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.405', 'weapon5': '0.008', 'AMMO2': '0.008', 'AMMO5': '0.013', 'weapon7': '0.024', 'ARMOR': '0.040', 'AMMO4': '0.041', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.102', 'HITCOUNT': '0.160', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.856', 'weapon2': '1.872'} [2024-08-05 07:14:56,798][00139] DAMAGECOUNT value on done: 64351.0 [2024-08-05 07:14:56,799][00139] Sum rewards: 4.367, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.066', 'weapon5': '0.002', 'AMMO5': '0.003', 'AMMO2': '0.004', 'AMMO4': '0.021', 'weapon4': '0.044', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'weapon7': '0.078', 'AMMO3': '0.096', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'HITCOUNT': '0.290', 'ARMOR': '0.428', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.209', 'weapon2': '1.376', 'weapon3': '1.542', 'FRAGCOUNT': '5.000'} [2024-08-05 07:15:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6283264. Throughput: 0: 287.4. Samples: 1571170. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:00,484][00034] Avg episode reward: [(0, '-2.433')] [2024-08-05 07:15:00,486][00132] Saving new best policy, reward=-2.433! [2024-08-05 07:15:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6291456. Throughput: 0: 288.5. Samples: 1572897. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:05,484][00034] Avg episode reward: [(0, '-2.433')] [2024-08-05 07:15:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6291456. Throughput: 0: 287.0. Samples: 1574616. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:10,485][00034] Avg episode reward: [(0, '-2.433')] [2024-08-05 07:15:11,261][00139] DAMAGECOUNT value on done: 61624.0 [2024-08-05 07:15:11,262][00139] Sum rewards: -3.808, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.507', 'FRAGCOUNT': '-0.500', 'ARMOR': '0.008', 'AMMO5': '0.017', 'AMMO2': '0.020', 'WEAPON1': '0.030', 'WEAPON4': '0.050', 'AMMO4': '0.098', 'weapon5': '0.132', 'AMMO3': '0.153', 'weapon4': '0.154', 'HITCOUNT': '0.320', 'WEAPON5': '0.350', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.323', 'weapon2': '1.362', 'weapon3': '2.032'} [2024-08-05 07:15:11,492][00139] DAMAGECOUNT value on done: 64703.0 [2024-08-05 07:15:11,493][00139] Sum rewards: -0.837, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.177', 'AMMO5': '0.007', 'AMMO2': '0.024', 'weapon7': '0.074', 'weapon5': '0.078', 'ARMOR': '0.088', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO4': '0.122', 'AMMO3': '0.137', 'HITCOUNT': '0.180', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'weapon4': '0.368', 'WEAPON3': '0.950', 'weapon2': '0.998', 'DAMAGECOUNT': '1.011', 'weapon3': '1.802', 'FRAGCOUNT': '3.000'} [2024-08-05 07:15:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6299648. Throughput: 0: 286.9. Samples: 1575473. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:15,484][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:15:15,492][00132] Saving new best policy, reward=-2.356! [2024-08-05 07:15:18,372][00138] Updated weights for policy 0, policy_version 770 (0.0018) [2024-08-05 07:15:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6307840. Throughput: 0: 286.0. Samples: 1577145. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:20,484][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:15:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). 
Total num frames: 6307840. Throughput: 0: 285.0. Samples: 1578873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:25,484][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:15:26,123][00139] DAMAGECOUNT value on done: 62098.0 [2024-08-05 07:15:26,123][00139] Sum rewards: 1.735, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.730', 'AMMO5': '0.005', 'WEAPON1': '0.020', 'AMMO2': '0.035', 'ARMOR': '0.072', 'weapon5': '0.090', 'AMMO3': '0.102', 'WEAPON5': '0.150', 'AMMO4': '0.177', 'HITCOUNT': '0.240', 'weapon4': '0.352', 'WEAPON4': '0.400', 'WEAPON3': '0.700', 'weapon2': '0.946', 'DAMAGECOUNT': '1.422', 'weapon3': '1.504', 'FRAGCOUNT': '4.000'} [2024-08-05 07:15:26,334][00139] DAMAGECOUNT value on done: 65173.0 [2024-08-05 07:15:26,334][00139] Sum rewards: 0.724, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.620', 'AMMO5': '0.007', 'AMMO2': '0.015', 'AMMO4': '0.075', 'WEAPON4': '0.100', 'weapon5': '0.128', 'WEAPON5': '0.150', 'weapon4': '0.172', 'AMMO3': '0.187', 'HITCOUNT': '0.210', 'ARMOR': '0.400', 'WEAPON3': '0.800', 'weapon3': '1.390', 'DAMAGECOUNT': '1.410', 'weapon2': '2.050', 'FRAGCOUNT': '4.000'} [2024-08-05 07:15:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6316032. Throughput: 0: 285.3. Samples: 1579727. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:30,484][00034] Avg episode reward: [(0, '-2.221')] [2024-08-05 07:15:30,486][00132] Saving new best policy, reward=-2.221! [2024-08-05 07:15:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6324224. Throughput: 0: 284.9. Samples: 1581396. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:35,485][00034] Avg episode reward: [(0, '-2.221')] [2024-08-05 07:15:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000772_6324224.pth... 
[2024-08-05 07:15:35,566][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000739_6053888.pth [2024-08-05 07:15:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6332416. Throughput: 0: 283.9. Samples: 1583089. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:40,484][00034] Avg episode reward: [(0, '-2.221')] [2024-08-05 07:15:41,062][00139] DAMAGECOUNT value on done: 62420.0 [2024-08-05 07:15:41,062][00139] Sum rewards: 1.017, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.420', 'AMMO2': '0.005', 'AMMO5': '0.010', 'weapon4': '0.020', 'ARMOR': '0.025', 'AMMO4': '0.027', 'weapon7': '0.048', 'WEAPON4': '0.050', 'weapon5': '0.050', 'AMMO3': '0.100', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.200', 'HITCOUNT': '0.310', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.966', 'weapon2': '1.726', 'weapon3': '1.800', 'FRAGCOUNT': '2.000'} [2024-08-05 07:15:41,284][00139] DAMAGECOUNT value on done: 65338.0 [2024-08-05 07:15:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6332416. Throughput: 0: 283.4. Samples: 1583923. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:45,484][00034] Avg episode reward: [(0, '-2.133')] [2024-08-05 07:15:45,492][00132] Saving new best policy, reward=-2.133! [2024-08-05 07:15:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6340608. Throughput: 0: 283.8. Samples: 1585670. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:50,486][00034] Avg episode reward: [(0, '-2.133')] [2024-08-05 07:15:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6348800. Throughput: 0: 283.3. Samples: 1587366. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:15:55,484][00034] Avg episode reward: [(0, '-2.133')] [2024-08-05 07:15:55,724][00139] DAMAGECOUNT value on done: 62819.0 [2024-08-05 07:15:55,725][00139] Sum rewards: -6.451, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.750', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.041', 'AMMO2': '-0.008', 'AMMO5': '0.009', 'weapon5': '0.024', 'AMMO3': '0.186', 'WEAPON5': '0.200', 'HITCOUNT': '0.320', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.197', 'weapon3': '1.750', 'weapon2': '1.812'} [2024-08-05 07:15:55,957][00139] DAMAGECOUNT value on done: 65623.0 [2024-08-05 07:15:55,958][00139] Sum rewards: -5.414, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-0.985', 'AMMO2': '0.019', 'AMMO5': '0.019', 'weapon5': '0.026', 'ARMOR': '0.056', 'AMMO4': '0.097', 'WEAPON4': '0.100', 'AMMO3': '0.184', 'WEAPON5': '0.200', 'HITCOUNT': '0.300', 'DAMAGECOUNT': '0.855', 'WEAPON3': '0.950', 'weapon2': '1.304', 'FRAGCOUNT': '2.000', 'weapon3': '2.210'} [2024-08-05 07:16:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6348800. Throughput: 0: 283.3. Samples: 1588223. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:00,484][00034] Avg episode reward: [(0, '-2.171')] [2024-08-05 07:16:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6356992. Throughput: 0: 283.0. Samples: 1589882. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:05,484][00034] Avg episode reward: [(0, '-2.171')] [2024-08-05 07:16:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6365184. Throughput: 0: 281.1. Samples: 1591524. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:10,484][00034] Avg episode reward: [(0, '-2.171')] [2024-08-05 07:16:10,872][00139] DAMAGECOUNT value on done: 62844.0 [2024-08-05 07:16:11,079][00139] DAMAGECOUNT value on done: 65813.0 [2024-08-05 07:16:11,080][00139] Sum rewards: -0.727, reward structure: {'DEATHCOUNT': '-7.500', 'ARMOR': '0.008', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'AMMO2': '0.029', 'AMMO3': '0.111', 'AMMO4': '0.147', 'HITCOUNT': '0.190', 'WEAPON4': '0.200', 'weapon5': '0.224', 'HEALTH': '0.233', 'WEAPON5': '0.250', 'weapon4': '0.300', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.570', 'FRAGCOUNT': '1.000', 'weapon2': '1.182', 'weapon3': '1.796'} [2024-08-05 07:16:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6365184. Throughput: 0: 281.4. Samples: 1592392. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:15,484][00034] Avg episode reward: [(0, '-2.154')] [2024-08-05 07:16:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6373376. Throughput: 0: 282.6. Samples: 1594113. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:20,485][00034] Avg episode reward: [(0, '-2.154')] [2024-08-05 07:16:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6381568. Throughput: 0: 282.2. Samples: 1595790. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:25,485][00034] Avg episode reward: [(0, '-2.154')] [2024-08-05 07:16:25,659][00139] DAMAGECOUNT value on done: 63429.0 [2024-08-05 07:16:25,660][00139] Sum rewards: -1.206, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.314', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.012', 'ARMOR': '0.016', 'AMMO4': '0.059', 'weapon5': '0.130', 'AMMO3': '0.146', 'WEAPON5': '0.150', 'weapon4': '0.184', 'WEAPON4': '0.200', 'HITCOUNT': '0.470', 'WEAPON3': '0.850', 'weapon3': '1.594', 'DAMAGECOUNT': '1.755', 'weapon2': '1.774', 'FRAGCOUNT': '4.000'} [2024-08-05 07:16:25,903][00139] DAMAGECOUNT value on done: 66196.0 [2024-08-05 07:16:25,904][00139] Sum rewards: -2.845, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.610', 'AMMO2': '0.001', 'AMMO4': '0.006', 'weapon5': '0.008', 'AMMO5': '0.013', 'ARMOR': '0.040', 'weapon4': '0.068', 'WEAPON4': '0.100', 'AMMO3': '0.230', 'WEAPON5': '0.250', 'HITCOUNT': '0.270', 'DAMAGECOUNT': '1.149', 'WEAPON3': '1.300', 'weapon2': '1.394', 'weapon3': '2.186', 'FRAGCOUNT': '4.000'} [2024-08-05 07:16:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6381568. Throughput: 0: 282.3. Samples: 1596626. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:30,484][00034] Avg episode reward: [(0, '-2.096')] [2024-08-05 07:16:30,486][00132] Saving new best policy, reward=-2.096! [2024-08-05 07:16:31,189][00138] Updated weights for policy 0, policy_version 780 (0.0017) [2024-08-05 07:16:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6389760. Throughput: 0: 281.2. Samples: 1598322. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:35,484][00034] Avg episode reward: [(0, '-2.096')] [2024-08-05 07:16:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6397952. Throughput: 0: 281.1. Samples: 1600014. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:40,484][00034] Avg episode reward: [(0, '-2.096')] [2024-08-05 07:16:40,550][00139] DAMAGECOUNT value on done: 63684.0 [2024-08-05 07:16:40,550][00139] Sum rewards: -5.000, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.155', 'AMMO2': '0.007', 'AMMO5': '0.018', 'AMMO4': '0.036', 'ARMOR': '0.048', 'weapon5': '0.138', 'AMMO3': '0.176', 'WEAPON4': '0.200', 'HITCOUNT': '0.230', 'weapon4': '0.288', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.765', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.438', 'weapon3': '1.860'} [2024-08-05 07:16:40,780][00139] DAMAGECOUNT value on done: 66648.0 [2024-08-05 07:16:40,780][00139] Sum rewards: 0.551, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.902', 'WEAPON1': '0.010', 'AMMO2': '0.013', 'AMMO5': '0.015', 'ARMOR': '0.040', 'AMMO4': '0.065', 'weapon5': '0.076', 'WEAPON4': '0.150', 'AMMO3': '0.161', 'weapon4': '0.204', 'WEAPON5': '0.250', 'HITCOUNT': '0.350', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.356', 'weapon2': '1.392', 'FRAGCOUNT': '2.000', 'weapon3': '2.070'} [2024-08-05 07:16:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6397952. Throughput: 0: 281.3. Samples: 1600883. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:45,484][00034] Avg episode reward: [(0, '-2.066')] [2024-08-05 07:16:45,511][00132] Saving new best policy, reward=-2.066! [2024-08-05 07:16:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6406144. Throughput: 0: 282.2. Samples: 1602583. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:50,485][00034] Avg episode reward: [(0, '-2.066')] [2024-08-05 07:16:55,407][00139] DAMAGECOUNT value on done: 64037.0 [2024-08-05 07:16:55,408][00139] Sum rewards: -1.354, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.035', 'ARMOR': '0.008', 'AMMO2': '0.020', 'weapon7': '0.020', 'AMMO5': '0.025', 'AMMO4': '0.099', 'WEAPON4': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.142', 'weapon5': '0.148', 'weapon4': '0.202', 'HITCOUNT': '0.280', 'WEAPON5': '0.350', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.059', 'weapon2': '1.616', 'weapon3': '1.762', 'FRAGCOUNT': '2.500'} [2024-08-05 07:16:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6414336. Throughput: 0: 283.0. Samples: 1604260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:16:55,484][00034] Avg episode reward: [(0, '-1.972')] [2024-08-05 07:16:55,492][00132] Saving new best policy, reward=-1.972! [2024-08-05 07:16:55,676][00139] DAMAGECOUNT value on done: 67138.0 [2024-08-05 07:16:55,677][00139] Sum rewards: -1.809, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.260', 'weapon5': '0.002', 'AMMO2': '0.006', 'AMMO5': '0.007', 'AMMO4': '0.028', 'WEAPON5': '0.150', 'AMMO3': '0.167', 'HITCOUNT': '0.460', 'WEAPON3': '0.900', 'weapon2': '1.440', 'DAMAGECOUNT': '1.470', 'weapon3': '1.820', 'FRAGCOUNT': '2.000'} [2024-08-05 07:17:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6422528. Throughput: 0: 282.0. Samples: 1605084. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:00,485][00034] Avg episode reward: [(0, '-1.934')] [2024-08-05 07:17:00,487][00132] Saving new best policy, reward=-1.934! [2024-08-05 07:17:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6422528. Throughput: 0: 281.7. Samples: 1606788. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:05,485][00034] Avg episode reward: [(0, '-1.934')] [2024-08-05 07:17:10,150][00139] DAMAGECOUNT value on done: 64357.0 [2024-08-05 07:17:10,151][00139] Sum rewards: -5.109, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.366', 'AMMO2': '0.002', 'AMMO5': '0.004', 'AMMO4': '0.008', 'ARMOR': '0.032', 'weapon4': '0.032', 'weapon5': '0.038', 'WEAPON4': '0.050', 'weapon7': '0.052', 'WEAPON5': '0.100', 'AMMO3': '0.117', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'HITCOUNT': '0.200', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.960', 'FRAGCOUNT': '1.000', 'weapon3': '1.358', 'weapon2': '1.664'} [2024-08-05 07:17:10,368][00139] DAMAGECOUNT value on done: 67626.0 [2024-08-05 07:17:10,368][00139] Sum rewards: -1.642, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.876', 'AMMO5': '0.005', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'AMMO4': '0.085', 'weapon4': '0.092', 'WEAPON5': '0.100', 'ARMOR': '0.104', 'AMMO3': '0.124', 'weapon5': '0.130', 'WEAPON4': '0.150', 'HITCOUNT': '0.220', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.464', 'FRAGCOUNT': '1.500', 'weapon2': '1.622', 'weapon3': '1.800'} [2024-08-05 07:17:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6430720. Throughput: 0: 283.1. Samples: 1608530. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:10,484][00034] Avg episode reward: [(0, '-1.941')] [2024-08-05 07:17:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6438912. Throughput: 0: 284.4. Samples: 1609424. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:15,484][00034] Avg episode reward: [(0, '-1.941')] [2024-08-05 07:17:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6438912. Throughput: 0: 285.4. Samples: 1611165. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:20,484][00034] Avg episode reward: [(0, '-1.941')] [2024-08-05 07:17:24,632][00139] DAMAGECOUNT value on done: 64901.0 [2024-08-05 07:17:24,633][00139] Sum rewards: -4.112, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.734', 'AMMO2': '0.007', 'AMMO5': '0.010', 'ARMOR': '0.032', 'AMMO4': '0.033', 'weapon5': '0.066', 'AMMO3': '0.196', 'WEAPON5': '0.250', 'HITCOUNT': '0.380', 'WEAPON3': '1.050', 'weapon2': '1.310', 'FRAGCOUNT': '1.500', 'DAMAGECOUNT': '1.632', 'weapon3': '2.406'} [2024-08-05 07:17:24,869][00139] DAMAGECOUNT value on done: 67681.0 [2024-08-05 07:17:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6447104. Throughput: 0: 285.6. Samples: 1612867. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:25,485][00034] Avg episode reward: [(0, '-1.883')] [2024-08-05 07:17:25,493][00132] Saving new best policy, reward=-1.883! [2024-08-05 07:17:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6455296. Throughput: 0: 284.8. Samples: 1613701. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:30,484][00034] Avg episode reward: [(0, '-1.883')] [2024-08-05 07:17:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6455296. Throughput: 0: 285.2. Samples: 1615417. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:35,484][00034] Avg episode reward: [(0, '-1.883')] [2024-08-05 07:17:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000788_6455296.pth... 
[2024-08-05 07:17:35,565][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000755_6184960.pth [2024-08-05 07:17:39,473][00139] DAMAGECOUNT value on done: 65392.0 [2024-08-05 07:17:39,473][00139] Sum rewards: -4.933, reward structure: {'DEATHCOUNT': '-15.000', 'HEALTH': '-2.640', 'AMMO5': '0.015', 'AMMO2': '0.020', 'AMMO4': '0.102', 'WEAPON5': '0.150', 'weapon5': '0.162', 'AMMO3': '0.181', 'HITCOUNT': '0.240', 'WEAPON4': '0.250', 'weapon4': '0.264', 'ARMOR': '0.502', 'WEAPON3': '1.050', 'weapon2': '1.330', 'DAMAGECOUNT': '1.473', 'weapon3': '1.968', 'FRAGCOUNT': '5.000'} [2024-08-05 07:17:39,698][00139] DAMAGECOUNT value on done: 68181.0 [2024-08-05 07:17:39,698][00139] Sum rewards: 0.344, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.207', 'AMMO2': '0.021', 'AMMO5': '0.027', 'weapon5': '0.100', 'AMMO4': '0.105', 'weapon4': '0.142', 'AMMO3': '0.154', 'WEAPON4': '0.200', 'HITCOUNT': '0.320', 'WEAPON5': '0.350', 'ARMOR': '0.848', 'WEAPON3': '1.000', 'weapon2': '1.308', 'DAMAGECOUNT': '1.500', 'weapon3': '2.226', 'FRAGCOUNT': '3.000'} [2024-08-05 07:17:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6463488. Throughput: 0: 285.6. Samples: 1617113. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:40,484][00034] Avg episode reward: [(0, '-1.911')] [2024-08-05 07:17:42,961][00138] Updated weights for policy 0, policy_version 790 (0.0017) [2024-08-05 07:17:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6471680. Throughput: 0: 286.5. Samples: 1617976. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:45,484][00034] Avg episode reward: [(0, '-1.911')] [2024-08-05 07:17:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6479872. Throughput: 0: 287.2. Samples: 1619714. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:50,485][00034] Avg episode reward: [(0, '-1.911')] [2024-08-05 07:17:54,248][00139] DAMAGECOUNT value on done: 65777.0 [2024-08-05 07:17:54,248][00139] Sum rewards: -0.554, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.390', 'AMMO2': '0.009', 'AMMO5': '0.009', 'WEAPON1': '0.020', 'weapon4': '0.032', 'AMMO4': '0.043', 'weapon5': '0.044', 'WEAPON4': '0.100', 'AMMO3': '0.122', 'WEAPON5': '0.200', 'HITCOUNT': '0.340', 'ARMOR': '0.550', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.155', 'weapon3': '1.620', 'FRAGCOUNT': '2.000', 'weapon2': '2.142'} [2024-08-05 07:17:54,498][00139] DAMAGECOUNT value on done: 68746.0 [2024-08-05 07:17:54,499][00139] Sum rewards: -1.169, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.430', 'AMMO4': '-0.008', 'AMMO2': '-0.002', 'WEAPON4': '0.150', 'AMMO3': '0.157', 'weapon4': '0.214', 'HITCOUNT': '0.400', 'WEAPON3': '0.800', 'weapon2': '1.272', 'DAMAGECOUNT': '1.695', 'weapon3': '2.082', 'FRAGCOUNT': '4.000'} [2024-08-05 07:17:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6479872. Throughput: 0: 285.4. Samples: 1621373. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:17:55,484][00034] Avg episode reward: [(0, '-1.800')] [2024-08-05 07:17:55,493][00132] Saving new best policy, reward=-1.800! [2024-08-05 07:18:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6488064. Throughput: 0: 283.3. Samples: 1622173. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:00,484][00034] Avg episode reward: [(0, '-1.800')] [2024-08-05 07:18:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6496256. Throughput: 0: 283.1. Samples: 1623903. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:05,484][00034] Avg episode reward: [(0, '-1.800')] [2024-08-05 07:18:09,174][00139] DAMAGECOUNT value on done: 65927.0 [2024-08-05 07:18:09,411][00139] DAMAGECOUNT value on done: 69478.0 [2024-08-05 07:18:09,411][00139] Sum rewards: 2.639, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.050', 'AMMO2': '0.005', 'AMMO5': '0.007', 'AMMO4': '0.023', 'weapon7': '0.048', 'weapon5': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.104', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'WEAPON5': '0.200', 'weapon4': '0.212', 'HITCOUNT': '0.280', 'WEAPON3': '0.600', 'weapon2': '0.890', 'weapon3': '1.544', 'DAMAGECOUNT': '1.896', 'FRAGCOUNT': '4.000'} [2024-08-05 07:18:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6496256. Throughput: 0: 282.8. Samples: 1625592. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:10,484][00034] Avg episode reward: [(0, '-1.742')] [2024-08-05 07:18:10,486][00132] Saving new best policy, reward=-1.742! [2024-08-05 07:18:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6504448. Throughput: 0: 283.3. Samples: 1626451. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:15,484][00034] Avg episode reward: [(0, '-1.742')] [2024-08-05 07:18:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6512640. Throughput: 0: 283.4. Samples: 1628172. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:20,484][00034] Avg episode reward: [(0, '-1.742')] [2024-08-05 07:18:23,771][00139] DAMAGECOUNT value on done: 66234.0 [2024-08-05 07:18:23,772][00139] Sum rewards: -2.505, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.090', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.020', 'weapon5': '0.042', 'weapon7': '0.052', 'ARMOR': '0.076', 'AMMO4': '0.098', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON5': '0.150', 'AMMO3': '0.178', 'WEAPON7': '0.200', 'WEAPON4': '0.200', 'HITCOUNT': '0.250', 'weapon4': '0.268', 'WEAPON3': '0.900', 'DAMAGECOUNT': '0.921', 'weapon3': '1.492', 'weapon2': '1.730', 'FRAGCOUNT': '4.000'} [2024-08-05 07:18:24,000][00139] DAMAGECOUNT value on done: 69886.0 [2024-08-05 07:18:24,000][00139] Sum rewards: -1.974, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.937', 'AMMO2': '0.013', 'AMMO5': '0.013', 'ARMOR': '0.024', 'AMMO4': '0.065', 'weapon5': '0.118', 'AMMO3': '0.173', 'HITCOUNT': '0.190', 'WEAPON4': '0.200', 'weapon4': '0.232', 'WEAPON5': '0.250', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.224', 'weapon3': '1.560', 'weapon2': '1.850', 'FRAGCOUNT': '3.000'} [2024-08-05 07:18:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6512640. Throughput: 0: 284.6. Samples: 1629918. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:25,484][00034] Avg episode reward: [(0, '-1.716')] [2024-08-05 07:18:25,492][00132] Saving new best policy, reward=-1.716! [2024-08-05 07:18:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6520832. Throughput: 0: 284.1. Samples: 1630760. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:30,485][00034] Avg episode reward: [(0, '-1.716')] [2024-08-05 07:18:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6529024. Throughput: 0: 281.8. Samples: 1632397. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:35,485][00034] Avg episode reward: [(0, '-1.716')] [2024-08-05 07:18:38,854][00139] DAMAGECOUNT value on done: 66583.0 [2024-08-05 07:18:38,854][00139] Sum rewards: -0.474, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.422', 'AMMO5': '0.005', 'weapon5': '0.010', 'AMMO2': '0.017', 'ARMOR': '0.048', 'WEAPON4': '0.050', 'AMMO4': '0.084', 'WEAPON5': '0.100', 'AMMO3': '0.165', 'weapon4': '0.182', 'HITCOUNT': '0.290', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.047', 'weapon2': '1.690', 'weapon3': '2.010', 'FRAGCOUNT': '3.000'} [2024-08-05 07:18:39,101][00139] DAMAGECOUNT value on done: 70152.0 [2024-08-05 07:18:39,101][00139] Sum rewards: -3.771, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.232', 'AMMO5': '0.012', 'AMMO2': '0.017', 'WEAPON4': '0.050', 'ARMOR': '0.064', 'weapon4': '0.068', 'AMMO4': '0.085', 'weapon5': '0.094', 'AMMO3': '0.154', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.798', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.588', 'weapon3': '1.810'} [2024-08-05 07:18:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6529024. Throughput: 0: 282.7. Samples: 1634093. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:40,484][00034] Avg episode reward: [(0, '-1.681')] [2024-08-05 07:18:40,487][00132] Saving new best policy, reward=-1.681! [2024-08-05 07:18:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6537216. Throughput: 0: 283.9. Samples: 1634949. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:45,484][00034] Avg episode reward: [(0, '-1.681')] [2024-08-05 07:18:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6545408. Throughput: 0: 283.4. Samples: 1636656. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:50,484][00034] Avg episode reward: [(0, '-1.681')] [2024-08-05 07:18:53,700][00139] DAMAGECOUNT value on done: 67123.0 [2024-08-05 07:18:53,701][00139] Sum rewards: 0.119, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.022', 'AMMO4': '-0.044', 'AMMO2': '-0.009', 'ARMOR': '0.004', 'AMMO5': '0.007', 'weapon5': '0.060', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.192', 'weapon4': '0.214', 'HITCOUNT': '0.450', 'WEAPON3': '1.000', 'weapon2': '1.064', 'DAMAGECOUNT': '1.620', 'weapon3': '2.332', 'FRAGCOUNT': '4.000'} [2024-08-05 07:18:53,928][00139] DAMAGECOUNT value on done: 70447.0 [2024-08-05 07:18:53,928][00139] Sum rewards: 1.413, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.030', 'AMMO2': '0.007', 'AMMO5': '0.018', 'AMMO4': '0.036', 'WEAPON1': '0.040', 'weapon5': '0.054', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.135', 'HITCOUNT': '0.250', 'WEAPON5': '0.400', 'ARMOR': '0.494', 'DAMAGECOUNT': '0.885', 'WEAPON3': '0.950', 'weapon2': '1.454', 'weapon3': '2.170', 'FRAGCOUNT': '6.000'} [2024-08-05 07:18:55,360][00138] Updated weights for policy 0, policy_version 800 (0.0018) [2024-08-05 07:18:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6553600. Throughput: 0: 283.8. Samples: 1638363. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:18:55,484][00034] Avg episode reward: [(0, '-1.566')] [2024-08-05 07:18:55,491][00132] Saving new best policy, reward=-1.566! [2024-08-05 07:19:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6553600. Throughput: 0: 283.2. Samples: 1639194. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:00,484][00034] Avg episode reward: [(0, '-1.566')] [2024-08-05 07:19:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6561792. Throughput: 0: 282.6. Samples: 1640889. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:05,485][00034] Avg episode reward: [(0, '-1.566')] [2024-08-05 07:19:08,262][00139] DAMAGECOUNT value on done: 67334.0 [2024-08-05 07:19:08,263][00139] Sum rewards: 1.805, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.010', 'AMMO4': '-0.024', 'AMMO2': '-0.005', 'ARMOR': '0.016', 'AMMO5': '0.022', 'weapon5': '0.064', 'AMMO3': '0.076', 'HITCOUNT': '0.090', 'weapon7': '0.242', 'WEAPON5': '0.250', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.633', 'weapon3': '1.516', 'weapon2': '1.544', 'FRAGCOUNT': '2.000'} [2024-08-05 07:19:08,475][00139] DAMAGECOUNT value on done: 71008.0 [2024-08-05 07:19:08,475][00139] Sum rewards: -0.420, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.818', 'AMMO2': '0.005', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'AMMO4': '0.027', 'weapon5': '0.098', 'WEAPON4': '0.100', 'weapon4': '0.172', 'AMMO3': '0.208', 'HITCOUNT': '0.270', 'WEAPON5': '0.350', 'ARMOR': '0.487', 'WEAPON3': '0.950', 'weapon2': '1.342', 'DAMAGECOUNT': '1.683', 'weapon3': '1.918', 'FRAGCOUNT': '5.000'} [2024-08-05 07:19:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.6). Total num frames: 6569984. Throughput: 0: 282.5. Samples: 1642632. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:10,484][00034] Avg episode reward: [(0, '-1.482')] [2024-08-05 07:19:10,486][00132] Saving new best policy, reward=-1.482! [2024-08-05 07:19:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6569984. Throughput: 0: 282.7. Samples: 1643482. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:15,485][00034] Avg episode reward: [(0, '-1.482')] [2024-08-05 07:19:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6578176. Throughput: 0: 283.5. Samples: 1645153. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:20,484][00034] Avg episode reward: [(0, '-1.482')] [2024-08-05 07:19:23,261][00139] DAMAGECOUNT value on done: 67884.0 [2024-08-05 07:19:23,262][00139] Sum rewards: 0.181, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.442', 'AMMO2': '0.010', 'AMMO5': '0.012', 'weapon5': '0.024', 'AMMO4': '0.051', 'weapon4': '0.084', 'WEAPON4': '0.150', 'AMMO3': '0.153', 'WEAPON5': '0.250', 'ARMOR': '0.416', 'HITCOUNT': '0.450', 'WEAPON3': '0.750', 'weapon2': '1.514', 'DAMAGECOUNT': '1.650', 'weapon3': '2.108', 'FRAGCOUNT': '5.000'} [2024-08-05 07:19:23,499][00139] DAMAGECOUNT value on done: 71593.0 [2024-08-05 07:19:23,499][00139] Sum rewards: -1.848, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.258', 'AMMO2': '0.009', 'AMMO5': '0.016', 'ARMOR': '0.040', 'AMMO4': '0.046', 'AMMO3': '0.110', 'weapon5': '0.244', 'HITCOUNT': '0.320', 'WEAPON4': '0.350', 'WEAPON5': '0.400', 'weapon4': '0.434', 'WEAPON3': '0.600', 'weapon3': '1.262', 'weapon2': '1.574', 'DAMAGECOUNT': '1.755', 'FRAGCOUNT': '2.000'} [2024-08-05 07:19:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6586368. Throughput: 0: 283.3. Samples: 1646842. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:25,486][00034] Avg episode reward: [(0, '-1.500')] [2024-08-05 07:19:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6586368. Throughput: 0: 283.5. Samples: 1647705. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:30,485][00034] Avg episode reward: [(0, '-1.500')] [2024-08-05 07:19:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6594560. Throughput: 0: 282.6. Samples: 1649371. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:35,485][00034] Avg episode reward: [(0, '-1.500')] [2024-08-05 07:19:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000805_6594560.pth... [2024-08-05 07:19:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000772_6324224.pth [2024-08-05 07:19:38,015][00139] DAMAGECOUNT value on done: 68164.0 [2024-08-05 07:19:38,015][00139] Sum rewards: -7.585, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-2.370', 'AMMO2': '0.001', 'AMMO4': '0.002', 'AMMO5': '0.012', 'weapon4': '0.014', 'ARMOR': '0.028', 'WEAPON4': '0.100', 'weapon5': '0.152', 'AMMO3': '0.200', 'HITCOUNT': '0.220', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.840', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.100', 'weapon3': '1.592', 'weapon2': '2.024'} [2024-08-05 07:19:38,268][00139] DAMAGECOUNT value on done: 71914.0 [2024-08-05 07:19:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6602752. Throughput: 0: 282.8. Samples: 1651087. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:40,484][00034] Avg episode reward: [(0, '-1.602')] [2024-08-05 07:19:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6602752. Throughput: 0: 282.9. Samples: 1651924. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:45,484][00034] Avg episode reward: [(0, '-1.602')] [2024-08-05 07:19:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6610944. Throughput: 0: 282.6. Samples: 1653606. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:50,484][00034] Avg episode reward: [(0, '-1.602')] [2024-08-05 07:19:53,149][00139] DAMAGECOUNT value on done: 68581.0 [2024-08-05 07:19:53,149][00139] Sum rewards: -5.411, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-3.514', 'AMMO2': '0.017', 'AMMO5': '0.022', 'weapon5': '0.042', 'weapon7': '0.050', 'AMMO4': '0.084', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'weapon4': '0.162', 'AMMO3': '0.243', 'WEAPON4': '0.250', 'WEAPON5': '0.300', 'HITCOUNT': '0.400', 'weapon2': '1.138', 'DAMAGECOUNT': '1.251', 'WEAPON3': '1.350', 'weapon3': '2.244', 'FRAGCOUNT': '3.000'} [2024-08-05 07:19:53,433][00139] DAMAGECOUNT value on done: 72263.0 [2024-08-05 07:19:53,434][00139] Sum rewards: -0.272, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.160', 'AMMO2': '0.007', 'weapon5': '0.008', 'AMMO5': '0.013', 'AMMO4': '0.034', 'WEAPON4': '0.050', 'weapon4': '0.076', 'WEAPON5': '0.100', 'AMMO3': '0.171', 'HITCOUNT': '0.320', 'ARMOR': '0.530', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.047', 'weapon2': '1.402', 'weapon3': '2.180', 'FRAGCOUNT': '3.000'} [2024-08-05 07:19:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6619136. Throughput: 0: 280.8. Samples: 1655267. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:19:55,484][00034] Avg episode reward: [(0, '-1.610')] [2024-08-05 07:20:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6619136. Throughput: 0: 280.5. Samples: 1656105. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:00,485][00034] Avg episode reward: [(0, '-1.610')] [2024-08-05 07:20:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6627328. Throughput: 0: 279.7. Samples: 1657738. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:05,485][00034] Avg episode reward: [(0, '-1.610')] [2024-08-05 07:20:08,203][00138] Updated weights for policy 0, policy_version 810 (0.0017) [2024-08-05 07:20:08,379][00139] DAMAGECOUNT value on done: 69034.0 [2024-08-05 07:20:08,379][00139] Sum rewards: -5.833, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-1.552', 'AMMO5': '0.018', 'AMMO2': '0.032', 'WEAPON4': '0.050', 'AMMO4': '0.160', 'weapon5': '0.206', 'AMMO3': '0.222', 'WEAPON5': '0.350', 'HITCOUNT': '0.360', 'ARMOR': '0.400', 'WEAPON3': '1.250', 'DAMAGECOUNT': '1.359', 'weapon2': '1.732', 'weapon3': '1.830', 'FRAGCOUNT': '2.000'} [2024-08-05 07:20:08,594][00139] DAMAGECOUNT value on done: 72748.0 [2024-08-05 07:20:08,595][00139] Sum rewards: -0.789, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.390', 'AMMO2': '0.013', 'AMMO5': '0.015', 'AMMO4': '0.064', 'AMMO3': '0.156', 'weapon5': '0.214', 'HITCOUNT': '0.250', 'WEAPON5': '0.350', 'ARMOR': '0.416', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.455', 'FRAGCOUNT': '1.500', 'weapon3': '1.658', 'weapon2': '1.660'} [2024-08-05 07:20:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6635520. Throughput: 0: 279.2. Samples: 1659408. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:10,485][00034] Avg episode reward: [(0, '-1.642')] [2024-08-05 07:20:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6635520. Throughput: 0: 279.2. Samples: 1660269. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:15,485][00034] Avg episode reward: [(0, '-1.642')] [2024-08-05 07:20:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6643712. Throughput: 0: 280.6. Samples: 1661998. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:20,484][00034] Avg episode reward: [(0, '-1.642')] [2024-08-05 07:20:23,133][00139] DAMAGECOUNT value on done: 69488.0 [2024-08-05 07:20:23,133][00139] Sum rewards: 6.010, reward structure: {'DEATHCOUNT': '-3.750', 'HEALTH': '-0.216', 'AMMO2': '0.020', 'weapon7': '0.052', 'AMMO3': '0.055', 'ARMOR': '0.088', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'AMMO4': '0.100', 'HITCOUNT': '0.270', 'weapon4': '0.272', 'WEAPON3': '0.300', 'weapon2': '0.942', 'weapon3': '1.114', 'DAMAGECOUNT': '1.362', 'FRAGCOUNT': '5.000'} [2024-08-05 07:20:23,370][00139] DAMAGECOUNT value on done: 72968.0 [2024-08-05 07:20:23,371][00139] Sum rewards: -0.042, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.674', 'AMMO2': '0.009', 'AMMO5': '0.010', 'AMMO4': '0.044', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'AMMO3': '0.112', 'weapon4': '0.136', 'HITCOUNT': '0.170', 'DAMAGECOUNT': '0.660', 'WEAPON3': '0.700', 'ARMOR': '0.947', 'FRAGCOUNT': '1.000', 'weapon2': '1.386', 'weapon3': '2.058'} [2024-08-05 07:20:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6651904. Throughput: 0: 280.7. Samples: 1663719. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:25,484][00034] Avg episode reward: [(0, '-1.520')] [2024-08-05 07:20:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6660096. Throughput: 0: 281.3. Samples: 1664583. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:30,485][00034] Avg episode reward: [(0, '-1.520')] [2024-08-05 07:20:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6660096. Throughput: 0: 281.3. Samples: 1666263. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:35,486][00034] Avg episode reward: [(0, '-1.520')] [2024-08-05 07:20:38,201][00139] DAMAGECOUNT value on done: 69806.0 [2024-08-05 07:20:38,202][00139] Sum rewards: 1.577, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.457', 'weapon5': '0.002', 'AMMO2': '0.005', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO4': '0.027', 'weapon4': '0.050', 'ARMOR': '0.062', 'WEAPON4': '0.100', 'weapon7': '0.112', 'AMMO3': '0.119', 'WEAPON5': '0.150', 'HITCOUNT': '0.230', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.954', 'weapon2': '1.700', 'weapon3': '1.764', 'FRAGCOUNT': '2.000'} [2024-08-05 07:20:38,438][00139] DAMAGECOUNT value on done: 73110.0 [2024-08-05 07:20:38,438][00139] Sum rewards: 3.383, reward structure: {'DEATHCOUNT': '-6.750', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'AMMO2': '0.016', 'weapon4': '0.022', 'weapon5': '0.024', 'WEAPON4': '0.050', 'AMMO4': '0.080', 'AMMO3': '0.096', 'weapon7': '0.152', 'HITCOUNT': '0.160', 'WEAPON5': '0.200', 'WEAPON7': '0.200', 'AMMO6': '0.200', 'AMMO7': '0.200', 'HEALTH': '0.266', 'DAMAGECOUNT': '0.426', 'ARMOR': '0.473', 'WEAPON3': '0.500', 'weapon2': '1.388', 'weapon3': '1.654', 'FRAGCOUNT': '4.000'} [2024-08-05 07:20:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6668288. Throughput: 0: 279.9. Samples: 1667861. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:40,485][00034] Avg episode reward: [(0, '-1.415')] [2024-08-05 07:20:40,487][00132] Saving new best policy, reward=-1.415! [2024-08-05 07:20:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6676480. Throughput: 0: 279.8. Samples: 1668695. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:45,484][00034] Avg episode reward: [(0, '-1.415')] [2024-08-05 07:20:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). 
Total num frames: 6676480. Throughput: 0: 279.6. Samples: 1670321. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:50,484][00034] Avg episode reward: [(0, '-1.415')] [2024-08-05 07:20:53,384][00139] DAMAGECOUNT value on done: 70131.0 [2024-08-05 07:20:53,384][00139] Sum rewards: -3.006, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.504', 'AMMO2': '0.009', 'AMMO5': '0.028', 'weapon4': '0.042', 'AMMO4': '0.045', 'WEAPON4': '0.050', 'ARMOR': '0.068', 'AMMO3': '0.083', 'weapon5': '0.132', 'HITCOUNT': '0.200', 'WEAPON5': '0.350', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.975', 'weapon3': '1.730', 'weapon2': '1.936', 'FRAGCOUNT': '2.000'} [2024-08-05 07:20:53,601][00139] DAMAGECOUNT value on done: 73495.0 [2024-08-05 07:20:53,602][00139] Sum rewards: -2.648, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.313', 'AMMO5': '0.007', 'ARMOR': '0.016', 'AMMO2': '0.023', 'weapon7': '0.048', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO4': '0.117', 'AMMO3': '0.118', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.226', 'HITCOUNT': '0.270', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.155', 'weapon3': '1.540', 'weapon2': '1.744'} [2024-08-05 07:20:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6684672. Throughput: 0: 280.8. Samples: 1672042. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:20:55,485][00034] Avg episode reward: [(0, '-1.450')] [2024-08-05 07:21:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6692864. Throughput: 0: 280.3. Samples: 1672883. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:00,485][00034] Avg episode reward: [(0, '-1.450')] [2024-08-05 07:21:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6692864. Throughput: 0: 280.0. Samples: 1674600. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:05,484][00034] Avg episode reward: [(0, '-1.450')] [2024-08-05 07:21:08,334][00139] DAMAGECOUNT value on done: 71218.0 [2024-08-05 07:21:08,335][00139] Sum rewards: 2.843, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.493', 'AMMO4': '-0.052', 'AMMO2': '-0.010', 'AMMO5': '0.007', 'ARMOR': '0.065', 'weapon5': '0.138', 'AMMO3': '0.151', 'WEAPON5': '0.200', 'HITCOUNT': '0.400', 'WEAPON3': '0.750', 'weapon3': '1.442', 'weapon2': '1.490', 'DAMAGECOUNT': '3.255', 'FRAGCOUNT': '7.000'} [2024-08-05 07:21:08,569][00139] DAMAGECOUNT value on done: 73829.0 [2024-08-05 07:21:08,570][00139] Sum rewards: -0.232, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.560', 'AMMO5': '0.012', 'AMMO2': '0.014', 'weapon4': '0.026', 'WEAPON4': '0.050', 'weapon7': '0.064', 'AMMO4': '0.068', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.132', 'weapon5': '0.144', 'HITCOUNT': '0.160', 'WEAPON5': '0.200', 'ARMOR': '0.400', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.002', 'weapon3': '1.546', 'weapon2': '1.710', 'FRAGCOUNT': '2.000'} [2024-08-05 07:21:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6701056. Throughput: 0: 277.6. Samples: 1676213. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:10,484][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:21:10,487][00132] Saving new best policy, reward=-1.372! [2024-08-05 07:21:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6709248. Throughput: 0: 277.6. Samples: 1677073. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:15,485][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:21:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6709248. Throughput: 0: 277.5. Samples: 1678749. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:20,484][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:21:21,378][00138] Updated weights for policy 0, policy_version 820 (0.0017) [2024-08-05 07:21:23,403][00139] DAMAGECOUNT value on done: 71878.0 [2024-08-05 07:21:23,403][00139] Sum rewards: 0.232, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.920', 'ARMOR': '0.008', 'AMMO2': '0.009', 'AMMO5': '0.010', 'AMMO4': '0.044', 'WEAPON5': '0.100', 'AMMO3': '0.185', 'HITCOUNT': '0.540', 'WEAPON3': '1.150', 'weapon3': '1.916', 'weapon2': '1.960', 'DAMAGECOUNT': '1.980', 'FRAGCOUNT': '7.000'} [2024-08-05 07:21:23,636][00139] DAMAGECOUNT value on done: 74242.0 [2024-08-05 07:21:23,637][00139] Sum rewards: 0.015, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.826', 'weapon4': '0.002', 'AMMO2': '0.013', 'WEAPON1': '0.020', 'AMMO5': '0.020', 'ARMOR': '0.028', 'WEAPON4': '0.050', 'AMMO4': '0.064', 'AMMO3': '0.132', 'weapon5': '0.170', 'WEAPON5': '0.300', 'HITCOUNT': '0.340', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.239', 'weapon3': '1.550', 'weapon2': '2.162', 'FRAGCOUNT': '4.000'} [2024-08-05 07:21:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6717440. Throughput: 0: 279.8. Samples: 1680450. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:25,484][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:21:25,491][00132] Saving new best policy, reward=-1.342! [2024-08-05 07:21:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6725632. Throughput: 0: 280.3. Samples: 1681310. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:30,484][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:21:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6725632. Throughput: 0: 281.8. Samples: 1683003. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:35,484][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:21:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000821_6725632.pth... [2024-08-05 07:21:35,578][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000788_6455296.pth [2024-08-05 07:21:38,250][00139] DAMAGECOUNT value on done: 72084.0 [2024-08-05 07:21:38,250][00139] Sum rewards: -1.089, reward structure: {'DEATHCOUNT': '-9.000', 'AMMO4': '-0.048', 'AMMO2': '-0.010', 'AMMO5': '0.028', 'AMMO3': '0.125', 'HEALTH': '0.197', 'HITCOUNT': '0.200', 'weapon5': '0.226', 'FRAGCOUNT': '0.500', 'WEAPON5': '0.500', 'DAMAGECOUNT': '0.618', 'WEAPON3': '0.750', 'ARMOR': '1.321', 'weapon2': '1.636', 'weapon3': '1.868'} [2024-08-05 07:21:38,528][00139] DAMAGECOUNT value on done: 74372.0 [2024-08-05 07:21:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6733824. Throughput: 0: 279.6. Samples: 1684624. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:40,485][00034] Avg episode reward: [(0, '-1.334')] [2024-08-05 07:21:40,487][00132] Saving new best policy, reward=-1.334! [2024-08-05 07:21:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6742016. Throughput: 0: 279.7. Samples: 1685471. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:45,484][00034] Avg episode reward: [(0, '-1.334')] [2024-08-05 07:21:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6742016. Throughput: 0: 278.6. Samples: 1687139. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:50,484][00034] Avg episode reward: [(0, '-1.334')] [2024-08-05 07:21:53,334][00139] DAMAGECOUNT value on done: 72428.0 [2024-08-05 07:21:53,334][00139] Sum rewards: -1.446, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.770', 'ARMOR': '0.008', 'WEAPON1': '0.010', 'AMMO2': '0.016', 'AMMO5': '0.028', 'weapon4': '0.036', 'WEAPON4': '0.050', 'weapon5': '0.064', 'AMMO4': '0.082', 'AMMO3': '0.156', 'HITCOUNT': '0.320', 'WEAPON5': '0.600', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.032', 'weapon2': '1.532', 'weapon3': '2.090', 'FRAGCOUNT': '3.000'} [2024-08-05 07:21:53,562][00139] DAMAGECOUNT value on done: 74822.0 [2024-08-05 07:21:53,563][00139] Sum rewards: -9.240, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-3.502', 'ARMOR': '0.016', 'WEAPON1': '0.020', 'AMMO5': '0.022', 'AMMO2': '0.023', 'AMMO4': '0.114', 'weapon5': '0.156', 'AMMO3': '0.193', 'HITCOUNT': '0.250', 'WEAPON4': '0.300', 'WEAPON5': '0.350', 'weapon4': '0.372', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'DAMAGECOUNT': '1.350', 'weapon2': '1.526', 'weapon3': '1.670'} [2024-08-05 07:21:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6750208. Throughput: 0: 280.8. Samples: 1688848. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:21:55,485][00034] Avg episode reward: [(0, '-1.467')] [2024-08-05 07:22:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6758400. Throughput: 0: 279.8. Samples: 1689665. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:00,484][00034] Avg episode reward: [(0, '-1.467')] [2024-08-05 07:22:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6766592. Throughput: 0: 279.7. Samples: 1691337. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:05,485][00034] Avg episode reward: [(0, '-1.467')] [2024-08-05 07:22:08,280][00139] DAMAGECOUNT value on done: 72646.0 [2024-08-05 07:22:08,281][00139] Sum rewards: 0.215, reward structure: {'DEATHCOUNT': '-6.000', 'AMMO5': '0.003', 'AMMO2': '0.003', 'AMMO4': '0.017', 'WEAPON5': '0.050', 'AMMO3': '0.090', 'HITCOUNT': '0.160', 'WEAPON3': '0.300', 'ARMOR': '0.456', 'HEALTH': '0.516', 'DAMAGECOUNT': '0.654', 'FRAGCOUNT': '1.000', 'weapon3': '1.200', 'weapon2': '1.766'} [2024-08-05 07:22:08,519][00139] DAMAGECOUNT value on done: 75177.0 [2024-08-05 07:22:08,519][00139] Sum rewards: -0.083, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.237', 'AMMO5': '0.003', 'AMMO2': '0.007', 'weapon4': '0.010', 'weapon5': '0.010', 'AMMO4': '0.034', 'ARMOR': '0.040', 'weapon7': '0.042', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.128', 'HITCOUNT': '0.260', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.975', 'FRAGCOUNT': '1.000', 'weapon2': '1.498', 'weapon3': '1.898'} [2024-08-05 07:22:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6766592. Throughput: 0: 279.6. Samples: 1693032. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:10,484][00034] Avg episode reward: [(0, '-1.421')] [2024-08-05 07:22:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6774784. Throughput: 0: 279.2. Samples: 1693873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:15,485][00034] Avg episode reward: [(0, '-1.421')] [2024-08-05 07:22:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6782976. Throughput: 0: 279.0. Samples: 1695560. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:20,485][00034] Avg episode reward: [(0, '-1.421')] [2024-08-05 07:22:23,319][00139] DAMAGECOUNT value on done: 73192.0 [2024-08-05 07:22:23,320][00139] Sum rewards: 2.513, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.776', 'ARMOR': '0.004', 'AMMO2': '0.015', 'AMMO5': '0.018', 'weapon5': '0.046', 'weapon7': '0.052', 'AMMO4': '0.075', 'AMMO3': '0.085', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'weapon4': '0.196', 'HITCOUNT': '0.250', 'WEAPON5': '0.300', 'WEAPON3': '0.500', 'weapon3': '1.176', 'DAMAGECOUNT': '1.638', 'weapon2': '1.784', 'FRAGCOUNT': '5.000'} [2024-08-05 07:22:23,574][00139] DAMAGECOUNT value on done: 75805.0 [2024-08-05 07:22:23,574][00139] Sum rewards: -1.458, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.170', 'AMMO2': '0.021', 'AMMO5': '0.023', 'AMMO4': '0.103', 'weapon5': '0.136', 'AMMO3': '0.157', 'WEAPON5': '0.350', 'HITCOUNT': '0.420', 'WEAPON3': '0.950', 'weapon2': '1.526', 'DAMAGECOUNT': '1.884', 'weapon3': '2.142', 'FRAGCOUNT': '3.500'} [2024-08-05 07:22:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6782976. Throughput: 0: 280.4. Samples: 1697243. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:25,484][00034] Avg episode reward: [(0, '-1.299')] [2024-08-05 07:22:25,491][00132] Saving new best policy, reward=-1.299! [2024-08-05 07:22:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6791168. Throughput: 0: 280.0. Samples: 1698071. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:30,484][00034] Avg episode reward: [(0, '-1.299')] [2024-08-05 07:22:34,446][00138] Updated weights for policy 0, policy_version 830 (0.0017) [2024-08-05 07:22:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6799360. Throughput: 0: 280.8. Samples: 1699777. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:35,484][00034] Avg episode reward: [(0, '-1.299')] [2024-08-05 07:22:38,253][00139] DAMAGECOUNT value on done: 73912.0 [2024-08-05 07:22:38,253][00139] Sum rewards: 4.770, reward structure: {'DEATHCOUNT': '-6.000', 'AMMO2': '0.010', 'AMMO5': '0.010', 'AMMO4': '0.048', 'weapon7': '0.052', 'AMMO3': '0.080', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HEALTH': '0.156', 'WEAPON5': '0.200', 'WEAPON7': '0.200', 'weapon5': '0.224', 'HITCOUNT': '0.270', 'WEAPON3': '0.350', 'ARMOR': '0.432', 'weapon3': '1.352', 'weapon2': '1.986', 'DAMAGECOUNT': '2.160', 'FRAGCOUNT': '3.000'} [2024-08-05 07:22:38,483][00139] DAMAGECOUNT value on done: 76010.0 [2024-08-05 07:22:38,483][00139] Sum rewards: -4.622, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.601', 'AMMO2': '0.007', 'AMMO5': '0.008', 'AMMO4': '0.036', 'WEAPON4': '0.050', 'weapon5': '0.068', 'ARMOR': '0.076', 'weapon7': '0.090', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.106', 'HITCOUNT': '0.120', 'WEAPON5': '0.200', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.615', 'FRAGCOUNT': '1.000', 'weapon3': '1.186', 'weapon2': '2.266'} [2024-08-05 07:22:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6799360. Throughput: 0: 279.8. Samples: 1701441. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:40,485][00034] Avg episode reward: [(0, '-1.221')] [2024-08-05 07:22:40,487][00132] Saving new best policy, reward=-1.221! [2024-08-05 07:22:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6807552. Throughput: 0: 278.8. Samples: 1702213. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:45,485][00034] Avg episode reward: [(0, '-1.221')] [2024-08-05 07:22:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6815744. Throughput: 0: 280.1. Samples: 1703943. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:50,485][00034] Avg episode reward: [(0, '-1.221')] [2024-08-05 07:22:53,367][00139] DAMAGECOUNT value on done: 74388.0 [2024-08-05 07:22:53,367][00139] Sum rewards: -3.472, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.500', 'AMMO5': '0.012', 'AMMO2': '0.014', 'ARMOR': '0.068', 'AMMO4': '0.069', 'AMMO3': '0.146', 'weapon5': '0.228', 'HITCOUNT': '0.300', 'WEAPON5': '0.300', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.428', 'FRAGCOUNT': '1.500', 'weapon2': '1.598', 'weapon3': '1.914'} [2024-08-05 07:22:53,586][00139] DAMAGECOUNT value on done: 76387.0 [2024-08-05 07:22:53,586][00139] Sum rewards: -2.041, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.664', 'AMMO4': '-0.002', 'AMMO2': '-0.000', 'AMMO5': '0.007', 'ARMOR': '0.032', 'weapon5': '0.082', 'WEAPON5': '0.100', 'AMMO3': '0.121', 'HITCOUNT': '0.330', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.131', 'weapon3': '1.842', 'weapon2': '1.930', 'FRAGCOUNT': '4.000'} [2024-08-05 07:22:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6815744. Throughput: 0: 279.8. Samples: 1705625. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:22:55,485][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:23:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6823936. Throughput: 0: 280.1. Samples: 1706476. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:00,485][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:23:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6832128. Throughput: 0: 281.1. Samples: 1708209. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:05,485][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:23:08,200][00139] DAMAGECOUNT value on done: 75181.0 [2024-08-05 07:23:08,201][00139] Sum rewards: 3.085, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.413', 'AMMO5': '0.017', 'AMMO2': '0.027', 'ARMOR': '0.028', 'AMMO4': '0.133', 'AMMO3': '0.140', 'WEAPON4': '0.150', 'weapon5': '0.196', 'WEAPON5': '0.250', 'HITCOUNT': '0.330', 'weapon4': '0.444', 'WEAPON3': '0.700', 'weapon3': '1.508', 'weapon2': '1.696', 'DAMAGECOUNT': '2.379', 'FRAGCOUNT': '6.000'} [2024-08-05 07:23:08,444][00139] DAMAGECOUNT value on done: 77057.0 [2024-08-05 07:23:08,444][00139] Sum rewards: 3.328, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.267', 'AMMO5': '0.012', 'AMMO2': '0.026', 'weapon4': '0.050', 'weapon7': '0.056', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.104', 'weapon5': '0.106', 'AMMO4': '0.127', 'WEAPON5': '0.200', 'HITCOUNT': '0.340', 'WEAPON3': '0.600', 'weapon3': '1.746', 'weapon2': '1.818', 'DAMAGECOUNT': '2.010', 'FRAGCOUNT': '5.000'} [2024-08-05 07:23:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6832128. Throughput: 0: 280.2. Samples: 1709850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:10,484][00034] Avg episode reward: [(0, '-1.159')] [2024-08-05 07:23:10,486][00132] Saving new best policy, reward=-1.159! [2024-08-05 07:23:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6840320. Throughput: 0: 280.4. Samples: 1710691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:15,485][00034] Avg episode reward: [(0, '-1.159')] [2024-08-05 07:23:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6848512. Throughput: 0: 278.5. Samples: 1712310. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:20,485][00034] Avg episode reward: [(0, '-1.159')] [2024-08-05 07:23:23,312][00139] DAMAGECOUNT value on done: 75346.0 [2024-08-05 07:23:23,312][00139] Sum rewards: -5.119, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.682', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.025', 'AMMO2': '-0.005', 'AMMO5': '0.005', 'weapon5': '0.020', 'WEAPON4': '0.050', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.164', 'weapon4': '0.172', 'ARMOR': '0.492', 'DAMAGECOUNT': '0.495', 'WEAPON3': '1.050', 'weapon2': '1.644', 'weapon3': '1.790'} [2024-08-05 07:23:23,549][00139] DAMAGECOUNT value on done: 77187.0 [2024-08-05 07:23:23,550][00139] Sum rewards: -5.691, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.950', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.020', 'ARMOR': '0.020', 'AMMO2': '0.024', 'WEAPON4': '0.050', 'weapon5': '0.090', 'AMMO4': '0.118', 'weapon4': '0.130', 'AMMO3': '0.131', 'HITCOUNT': '0.140', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.800', 'weapon2': '1.502', 'weapon3': '1.844'} [2024-08-05 07:23:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6848512. Throughput: 0: 280.8. Samples: 1714076. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:25,484][00034] Avg episode reward: [(0, '-1.198')] [2024-08-05 07:23:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6856704. Throughput: 0: 281.8. Samples: 1714892. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:30,484][00034] Avg episode reward: [(0, '-1.198')] [2024-08-05 07:23:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6864896. Throughput: 0: 280.9. Samples: 1716584. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:35,484][00034] Avg episode reward: [(0, '-1.198')] [2024-08-05 07:23:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000838_6864896.pth... [2024-08-05 07:23:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000805_6594560.pth [2024-08-05 07:23:36,621][00139] Large shaping reward -2.519 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.27, -90.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 07:23:38,107][00139] DAMAGECOUNT value on done: 75729.0 [2024-08-05 07:23:38,107][00139] Sum rewards: 2.825, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.707', 'AMMO2': '0.012', 'AMMO4': '0.062', 'AMMO3': '0.083', 'weapon7': '0.098', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.150', 'weapon4': '0.174', 'HITCOUNT': '0.200', 'WEAPON3': '0.450', 'ARMOR': '0.516', 'weapon2': '1.046', 'DAMAGECOUNT': '1.149', 'weapon3': '1.292', 'FRAGCOUNT': '4.000'} [2024-08-05 07:23:38,346][00139] DAMAGECOUNT value on done: 77631.0 [2024-08-05 07:23:38,346][00139] Sum rewards: -6.737, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.350', 'AMMO2': '0.003', 'AMMO5': '0.008', 'AMMO4': '0.017', 'weapon5': '0.038', 'ARMOR': '0.064', 'WEAPON5': '0.100', 'AMMO3': '0.143', 'WEAPON4': '0.150', 'weapon4': '0.170', 'HITCOUNT': '0.380', 'FRAGCOUNT': '0.500', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.332', 'weapon2': '1.674', 'weapon3': '1.982'} [2024-08-05 07:23:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6873088. Throughput: 0: 281.3. Samples: 1718283. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:40,484][00034] Avg episode reward: [(0, '-1.259')] [2024-08-05 07:23:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6873088. Throughput: 0: 281.6. Samples: 1719150. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:45,486][00034] Avg episode reward: [(0, '-1.259')] [2024-08-05 07:23:47,488][00138] Updated weights for policy 0, policy_version 840 (0.0018) [2024-08-05 07:23:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6881280. Throughput: 0: 279.6. Samples: 1720790. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:50,484][00034] Avg episode reward: [(0, '-1.259')] [2024-08-05 07:23:53,037][00139] DAMAGECOUNT value on done: 76059.0 [2024-08-05 07:23:53,038][00139] Sum rewards: -1.182, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.408', 'weapon4': '0.002', 'AMMO2': '0.006', 'AMMO5': '0.009', 'AMMO4': '0.030', 'WEAPON4': '0.100', 'AMMO3': '0.109', 'weapon5': '0.132', 'HITCOUNT': '0.200', 'WEAPON5': '0.250', 'WEAPON3': '0.700', 'ARMOR': '0.848', 'DAMAGECOUNT': '0.990', 'FRAGCOUNT': '1.500', 'weapon3': '1.556', 'weapon2': '2.044'} [2024-08-05 07:23:53,274][00139] DAMAGECOUNT value on done: 78001.0 [2024-08-05 07:23:53,274][00139] Sum rewards: 0.559, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.010', 'AMMO5': '0.005', 'AMMO2': '0.011', 'weapon5': '0.030', 'weapon7': '0.048', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'AMMO4': '0.056', 'AMMO3': '0.070', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.150', 'WEAPON7': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.450', 'DAMAGECOUNT': '1.110', 'weapon3': '1.494', 'weapon2': '1.814', 'FRAGCOUNT': '2.000'} [2024-08-05 07:23:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6889472. Throughput: 0: 280.9. Samples: 1722492. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:23:55,485][00034] Avg episode reward: [(0, '-1.233')] [2024-08-05 07:24:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6889472. Throughput: 0: 281.6. Samples: 1723364. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:00,484][00034] Avg episode reward: [(0, '-1.233')] [2024-08-05 07:24:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6897664. Throughput: 0: 282.6. Samples: 1725027. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:05,485][00034] Avg episode reward: [(0, '-1.233')] [2024-08-05 07:24:07,967][00139] DAMAGECOUNT value on done: 76229.0 [2024-08-05 07:24:07,968][00139] Sum rewards: 0.926, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.800', 'AMMO4': '-0.028', 'AMMO2': '-0.005', 'AMMO5': '0.003', 'AMMO3': '0.050', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'ARMOR': '0.100', 'weapon5': '0.118', 'HITCOUNT': '0.120', 'weapon4': '0.132', 'weapon7': '0.180', 'AMMO6': '0.220', 'AMMO7': '0.220', 'WEAPON7': '0.300', 'WEAPON3': '0.450', 'DAMAGECOUNT': '0.510', 'FRAGCOUNT': '1.000', 'weapon3': '1.222', 'weapon2': '1.534'} [2024-08-05 07:24:08,205][00139] DAMAGECOUNT value on done: 78395.0 [2024-08-05 07:24:08,206][00139] Sum rewards: 0.149, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.050', 'AMMO2': '0.012', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'weapon5': '0.034', 'AMMO4': '0.057', 'ARMOR': '0.061', 'AMMO3': '0.130', 'WEAPON5': '0.350', 'HITCOUNT': '0.370', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.182', 'weapon2': '1.608', 'FRAGCOUNT': '2.000', 'weapon3': '2.008'} [2024-08-05 07:24:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6905856. Throughput: 0: 280.6. Samples: 1726702. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:10,485][00034] Avg episode reward: [(0, '-1.183')] [2024-08-05 07:24:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6905856. Throughput: 0: 281.6. Samples: 1727564. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:15,485][00034] Avg episode reward: [(0, '-1.183')] [2024-08-05 07:24:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6914048. Throughput: 0: 280.1. Samples: 1729187. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:20,485][00034] Avg episode reward: [(0, '-1.183')] [2024-08-05 07:24:23,218][00139] DAMAGECOUNT value on done: 76474.0 [2024-08-05 07:24:23,219][00139] Sum rewards: 0.270, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.544', 'AMMO5': '0.003', 'AMMO2': '0.008', 'weapon5': '0.010', 'weapon4': '0.024', 'AMMO4': '0.041', 'ARMOR': '0.044', 'WEAPON5': '0.050', 'weapon7': '0.054', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.129', 'HITCOUNT': '0.180', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.750', 'weapon2': '1.856', 'weapon3': '1.890', 'FRAGCOUNT': '2.000'} [2024-08-05 07:24:23,449][00139] DAMAGECOUNT value on done: 79068.0 [2024-08-05 07:24:23,450][00139] Sum rewards: -3.932, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.836', 'AMMO4': '-0.035', 'AMMO2': '-0.007', 'AMMO5': '0.025', 'ARMOR': '0.048', 'weapon5': '0.106', 'AMMO3': '0.145', 'WEAPON5': '0.450', 'HITCOUNT': '0.460', 'WEAPON3': '1.050', 'weapon2': '1.444', 'FRAGCOUNT': '1.500', 'DAMAGECOUNT': '2.019', 'weapon3': '2.200'} [2024-08-05 07:24:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6922240. Throughput: 0: 279.5. Samples: 1730862. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:25,485][00034] Avg episode reward: [(0, '-1.204')] [2024-08-05 07:24:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6922240. Throughput: 0: 279.4. Samples: 1731722. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:30,484][00034] Avg episode reward: [(0, '-1.204')] [2024-08-05 07:24:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6930432. Throughput: 0: 282.0. Samples: 1733481. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:35,485][00034] Avg episode reward: [(0, '-1.204')] [2024-08-05 07:24:37,769][00139] DAMAGECOUNT value on done: 76792.0 [2024-08-05 07:24:37,770][00139] Sum rewards: -6.182, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.432', 'AMMO2': '0.008', 'WEAPON1': '0.010', 'weapon4': '0.012', 'AMMO5': '0.017', 'AMMO4': '0.042', 'weapon5': '0.060', 'ARMOR': '0.076', 'WEAPON4': '0.100', 'AMMO3': '0.186', 'HITCOUNT': '0.310', 'WEAPON5': '0.350', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.954', 'WEAPON3': '1.050', 'weapon2': '1.290', 'weapon3': '2.284'} [2024-08-05 07:24:38,011][00139] DAMAGECOUNT value on done: 79714.0 [2024-08-05 07:24:38,011][00139] Sum rewards: 4.723, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.341', 'AMMO5': '0.003', 'AMMO2': '0.014', 'ARMOR': '0.040', 'AMMO4': '0.068', 'weapon5': '0.090', 'WEAPON5': '0.100', 'AMMO3': '0.168', 'HITCOUNT': '0.470', 'WEAPON3': '0.900', 'weapon3': '1.824', 'DAMAGECOUNT': '1.938', 'weapon2': '1.950', 'FRAGCOUNT': '9.000'} [2024-08-05 07:24:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6938624. Throughput: 0: 281.6. Samples: 1735163. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:40,484][00034] Avg episode reward: [(0, '-1.262')] [2024-08-05 07:24:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6938624. Throughput: 0: 281.4. Samples: 1736028. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:45,484][00034] Avg episode reward: [(0, '-1.262')] [2024-08-05 07:24:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6946816. 
Throughput: 0: 280.8. Samples: 1737661. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:50,484][00034] Avg episode reward: [(0, '-1.262')] [2024-08-05 07:24:52,916][00139] DAMAGECOUNT value on done: 77307.0 [2024-08-05 07:24:52,917][00139] Sum rewards: -5.161, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-2.636', 'AMMO4': '-0.021', 'AMMO2': '-0.004', 'weapon5': '0.006', 'WEAPON1': '0.010', 'AMMO5': '0.012', 'ARMOR': '0.060', 'WEAPON5': '0.150', 'AMMO3': '0.187', 'HITCOUNT': '0.370', 'WEAPON3': '1.100', 'weapon2': '1.312', 'DAMAGECOUNT': '1.545', 'weapon3': '2.248', 'FRAGCOUNT': '4.000'} [2024-08-05 07:24:53,212][00139] DAMAGECOUNT value on done: 79864.0 [2024-08-05 07:24:53,213][00139] Sum rewards: 1.415, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.104', 'AMMO5': '0.010', 'AMMO2': '0.025', 'AMMO3': '0.068', 'ARMOR': '0.092', 'WEAPON5': '0.100', 'AMMO4': '0.122', 'HITCOUNT': '0.140', 'weapon5': '0.184', 'WEAPON3': '0.450', 'DAMAGECOUNT': '0.450', 'weapon3': '1.278', 'weapon2': '1.600', 'FRAGCOUNT': '3.000'} [2024-08-05 07:24:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6955008. Throughput: 0: 280.8. Samples: 1739339. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:24:55,484][00034] Avg episode reward: [(0, '-1.280')] [2024-08-05 07:25:00,369][00138] Updated weights for policy 0, policy_version 850 (0.0017) [2024-08-05 07:25:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6963200. Throughput: 0: 280.9. Samples: 1740204. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:00,485][00034] Avg episode reward: [(0, '-1.280')] [2024-08-05 07:25:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6963200. Throughput: 0: 282.3. Samples: 1741891. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:05,484][00034] Avg episode reward: [(0, '-1.280')] [2024-08-05 07:25:07,870][00139] DAMAGECOUNT value on done: 77486.0 [2024-08-05 07:25:07,871][00139] Sum rewards: -0.338, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.776', 'AMMO2': '0.001', 'AMMO4': '0.004', 'AMMO5': '0.010', 'weapon4': '0.034', 'ARMOR': '0.040', 'weapon5': '0.068', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'AMMO3': '0.108', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.537', 'WEAPON3': '0.700', 'weapon3': '1.570', 'weapon2': '2.026', 'FRAGCOUNT': '5.000'} [2024-08-05 07:25:08,105][00139] DAMAGECOUNT value on done: 80099.0 [2024-08-05 07:25:08,105][00139] Sum rewards: -5.500, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.302', 'weapon5': '0.002', 'AMMO5': '0.003', 'AMMO2': '0.011', 'WEAPON5': '0.050', 'AMMO4': '0.053', 'weapon4': '0.106', 'AMMO3': '0.144', 'ARMOR': '0.164', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.705', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.510', 'weapon3': '2.224'} [2024-08-05 07:25:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 6971392. Throughput: 0: 282.5. Samples: 1743573. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:10,485][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:25:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6979584. Throughput: 0: 282.2. Samples: 1744419. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:15,484][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:25:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6979584. Throughput: 0: 280.6. Samples: 1746110. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:20,484][00034] Avg episode reward: [(0, '-1.272')] [2024-08-05 07:25:23,002][00139] DAMAGECOUNT value on done: 77921.0 [2024-08-05 07:25:23,003][00139] Sum rewards: -2.372, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.492', 'AMMO2': '0.003', 'AMMO5': '0.005', 'AMMO4': '0.013', 'WEAPON5': '0.050', 'weapon5': '0.066', 'AMMO3': '0.146', 'HITCOUNT': '0.350', 'ARMOR': '0.432', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.305', 'FRAGCOUNT': '1.500', 'weapon2': '1.632', 'weapon3': '1.768'} [2024-08-05 07:25:23,216][00139] DAMAGECOUNT value on done: 80319.0 [2024-08-05 07:25:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6987776. Throughput: 0: 279.5. Samples: 1747740. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:25,485][00034] Avg episode reward: [(0, '-1.317')] [2024-08-05 07:25:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 6995968. Throughput: 0: 279.4. Samples: 1748599. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:30,484][00034] Avg episode reward: [(0, '-1.317')] [2024-08-05 07:25:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 6995968. Throughput: 0: 281.6. Samples: 1750331. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:35,485][00034] Avg episode reward: [(0, '-1.317')] [2024-08-05 07:25:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000854_6995968.pth... 
[2024-08-05 07:25:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000821_6725632.pth [2024-08-05 07:25:37,737][00139] DAMAGECOUNT value on done: 78292.0 [2024-08-05 07:25:37,738][00139] Sum rewards: -0.494, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.570', 'weapon4': '0.020', 'AMMO2': '0.022', 'ARMOR': '0.062', 'WEAPON4': '0.100', 'AMMO4': '0.110', 'AMMO3': '0.137', 'HITCOUNT': '0.270', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.113', 'weapon2': '1.580', 'weapon3': '1.912', 'FRAGCOUNT': '3.000'} [2024-08-05 07:25:37,969][00139] DAMAGECOUNT value on done: 80792.0 [2024-08-05 07:25:37,969][00139] Sum rewards: -3.678, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.341', 'AMMO5': '0.007', 'AMMO2': '0.018', 'AMMO4': '0.089', 'WEAPON5': '0.100', 'weapon5': '0.100', 'AMMO3': '0.119', 'HITCOUNT': '0.290', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.364', 'DAMAGECOUNT': '1.419', 'weapon2': '2.156'} [2024-08-05 07:25:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7004160. Throughput: 0: 282.0. Samples: 1752027. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:40,484][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:25:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7012352. Throughput: 0: 281.8. Samples: 1752883. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:45,485][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:25:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7012352. Throughput: 0: 281.8. Samples: 1754574. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:50,484][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:25:52,814][00139] DAMAGECOUNT value on done: 78377.0 [2024-08-05 07:25:53,025][00139] DAMAGECOUNT value on done: 81467.0 [2024-08-05 07:25:53,026][00139] Sum rewards: -0.661, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-3.100', 'AMMO2': '0.003', 'AMMO5': '0.013', 'AMMO4': '0.014', 'weapon5': '0.058', 'WEAPON4': '0.100', 'AMMO3': '0.218', 'WEAPON5': '0.250', 'ARMOR': '0.440', 'HITCOUNT': '0.570', 'weapon2': '1.200', 'WEAPON3': '1.400', 'DAMAGECOUNT': '2.025', 'weapon3': '2.398', 'FRAGCOUNT': '8.000'} [2024-08-05 07:25:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7020544. Throughput: 0: 281.2. Samples: 1756225. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:25:55,486][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:26:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7028736. Throughput: 0: 281.7. Samples: 1757096. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:00,484][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:26:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7028736. Throughput: 0: 281.8. Samples: 1758791. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:05,485][00034] Avg episode reward: [(0, '-1.342')] [2024-08-05 07:26:07,453][00139] DAMAGECOUNT value on done: 79068.0 [2024-08-05 07:26:07,454][00139] Sum rewards: 4.179, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.308', 'AMMO5': '0.003', 'AMMO2': '0.012', 'WEAPON5': '0.050', 'weapon5': '0.052', 'ARMOR': '0.056', 'AMMO4': '0.060', 'weapon7': '0.074', 'AMMO3': '0.087', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.162', 'WEAPON7': '0.200', 'HITCOUNT': '0.340', 'WEAPON3': '0.550', 'weapon2': '1.698', 'weapon3': '1.730', 'DAMAGECOUNT': '2.073', 'FRAGCOUNT': '3.000'} [2024-08-05 07:26:07,700][00139] DAMAGECOUNT value on done: 81862.0 [2024-08-05 07:26:07,700][00139] Sum rewards: 1.145, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.370', 'weapon5': '0.002', 'AMMO5': '0.003', 'AMMO2': '0.005', 'AMMO4': '0.024', 'WEAPON5': '0.050', 'ARMOR': '0.080', 'AMMO3': '0.104', 'HITCOUNT': '0.300', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.185', 'weapon3': '1.544', 'weapon2': '2.168', 'FRAGCOUNT': '3.000'} [2024-08-05 07:26:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7036928. Throughput: 0: 282.7. Samples: 1760460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:10,485][00034] Avg episode reward: [(0, '-1.289')] [2024-08-05 07:26:13,009][00138] Updated weights for policy 0, policy_version 860 (0.0017) [2024-08-05 07:26:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7045120. Throughput: 0: 282.8. Samples: 1761324. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:15,485][00034] Avg episode reward: [(0, '-1.289')] [2024-08-05 07:26:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7053312. Throughput: 0: 283.6. Samples: 1763091. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:20,484][00034] Avg episode reward: [(0, '-1.289')] [2024-08-05 07:26:22,203][00139] DAMAGECOUNT value on done: 79304.0 [2024-08-05 07:26:22,204][00139] Sum rewards: -1.172, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.540', 'AMMO5': '0.007', 'AMMO2': '0.021', 'weapon5': '0.050', 'weapon4': '0.062', 'WEAPON4': '0.100', 'AMMO4': '0.106', 'AMMO3': '0.139', 'WEAPON5': '0.150', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.708', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.006', 'weapon3': '1.798'} [2024-08-05 07:26:22,452][00139] DAMAGECOUNT value on done: 82270.0 [2024-08-05 07:26:22,452][00139] Sum rewards: 1.375, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.726', 'AMMO5': '0.003', 'AMMO2': '0.004', 'AMMO4': '0.017', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'weapon7': '0.068', 'weapon5': '0.080', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'ARMOR': '0.103', 'AMMO3': '0.121', 'weapon4': '0.190', 'HITCOUNT': '0.200', 'WEAPON3': '0.650', 'DAMAGECOUNT': '1.167', 'weapon3': '1.238', 'weapon2': '1.860', 'FRAGCOUNT': '2.000'} [2024-08-05 07:26:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7053312. Throughput: 0: 281.4. Samples: 1764691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:25,484][00034] Avg episode reward: [(0, '-1.346')] [2024-08-05 07:26:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7061504. Throughput: 0: 280.9. Samples: 1765525. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:30,485][00034] Avg episode reward: [(0, '-1.346')] [2024-08-05 07:26:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7069696. Throughput: 0: 281.8. Samples: 1767257. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:35,485][00034] Avg episode reward: [(0, '-1.346')] [2024-08-05 07:26:37,336][00139] DAMAGECOUNT value on done: 79619.0 [2024-08-05 07:26:37,336][00139] Sum rewards: -1.724, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.225', 'AMMO4': '-0.061', 'AMMO2': '-0.012', 'AMMO5': '0.024', 'weapon5': '0.040', 'ARMOR': '0.120', 'AMMO3': '0.151', 'HITCOUNT': '0.240', 'WEAPON5': '0.300', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.945', 'weapon2': '1.356', 'FRAGCOUNT': '1.500', 'weapon3': '1.598'} [2024-08-05 07:26:37,570][00139] DAMAGECOUNT value on done: 82503.0 [2024-08-05 07:26:37,570][00139] Sum rewards: -3.930, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.487', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.044', 'AMMO2': '-0.009', 'AMMO5': '0.006', 'weapon5': '0.094', 'ARMOR': '0.104', 'AMMO3': '0.138', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.699', 'weapon3': '1.702', 'weapon2': '1.846'} [2024-08-05 07:26:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7069696. Throughput: 0: 282.3. Samples: 1768927. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:40,484][00034] Avg episode reward: [(0, '-1.366')] [2024-08-05 07:26:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7077888. Throughput: 0: 281.6. Samples: 1769766. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:45,484][00034] Avg episode reward: [(0, '-1.366')] [2024-08-05 07:26:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7086080. Throughput: 0: 282.3. Samples: 1771493. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:50,486][00034] Avg episode reward: [(0, '-1.366')] [2024-08-05 07:26:52,070][00139] DAMAGECOUNT value on done: 79780.0 [2024-08-05 07:26:52,316][00139] DAMAGECOUNT value on done: 82775.0 [2024-08-05 07:26:52,317][00139] Sum rewards: -1.191, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.906', 'AMMO4': '-0.071', 'AMMO2': '-0.014', 'AMMO3': '0.086', 'ARMOR': '0.112', 'HITCOUNT': '0.230', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.816', 'weapon3': '1.786', 'weapon2': '1.820', 'FRAGCOUNT': '2.000'} [2024-08-05 07:26:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7086080. Throughput: 0: 282.3. Samples: 1773164. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:26:55,484][00034] Avg episode reward: [(0, '-1.381')] [2024-08-05 07:27:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7094272. Throughput: 0: 281.0. Samples: 1773967. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:00,485][00034] Avg episode reward: [(0, '-1.381')] [2024-08-05 07:27:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7102464. Throughput: 0: 279.0. Samples: 1775646. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:05,484][00034] Avg episode reward: [(0, '-1.381')] [2024-08-05 07:27:07,232][00139] DAMAGECOUNT value on done: 80339.0 [2024-08-05 07:27:07,233][00139] Sum rewards: 3.900, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.682', 'AMMO4': '-0.014', 'AMMO2': '-0.003', 'weapon5': '0.002', 'AMMO5': '0.003', 'ARMOR': '0.044', 'WEAPON5': '0.050', 'WEAPON4': '0.100', 'AMMO3': '0.119', 'weapon4': '0.120', 'HITCOUNT': '0.450', 'WEAPON3': '0.700', 'weapon2': '1.490', 'DAMAGECOUNT': '1.677', 'weapon3': '1.844', 'FRAGCOUNT': '7.000'} [2024-08-05 07:27:07,456][00139] DAMAGECOUNT value on done: 83242.0 [2024-08-05 07:27:07,456][00139] Sum rewards: -6.189, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.832', 'AMMO4': '-0.018', 'AMMO2': '-0.003', 'WEAPON1': '0.010', 'AMMO5': '0.018', 'ARMOR': '0.056', 'weapon5': '0.060', 'AMMO3': '0.194', 'WEAPON5': '0.300', 'HITCOUNT': '0.310', 'WEAPON3': '1.150', 'DAMAGECOUNT': '1.401', 'weapon2': '1.610', 'FRAGCOUNT': '2.000', 'weapon3': '2.056'} [2024-08-05 07:27:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7102464. Throughput: 0: 281.3. Samples: 1777348. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:10,484][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:27:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7110656. Throughput: 0: 281.4. Samples: 1778190. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:15,485][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:27:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7118848. Throughput: 0: 281.7. Samples: 1779934. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:20,484][00034] Avg episode reward: [(0, '-1.340')] [2024-08-05 07:27:21,982][00139] DAMAGECOUNT value on done: 80774.0 [2024-08-05 07:27:21,983][00139] Sum rewards: -1.876, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.440', 'weapon7': '0.002', 'AMMO2': '0.009', 'AMMO5': '0.015', 'AMMO4': '0.045', 'weapon5': '0.112', 'weapon4': '0.130', 'WEAPON4': '0.150', 'AMMO3': '0.192', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.250', 'HITCOUNT': '0.340', 'ARMOR': '0.543', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.305', 'weapon2': '1.360', 'FRAGCOUNT': '1.500', 'weapon3': '2.110'} [2024-08-05 07:27:22,218][00139] DAMAGECOUNT value on done: 83499.0 [2024-08-05 07:27:22,218][00139] Sum rewards: 1.240, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.202', 'AMMO5': '0.005', 'AMMO2': '0.008', 'AMMO4': '0.041', 'ARMOR': '0.087', 'WEAPON5': '0.100', 'weapon5': '0.126', 'AMMO3': '0.143', 'HITCOUNT': '0.270', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.771', 'weapon3': '1.584', 'weapon2': '1.706', 'FRAGCOUNT': '2.000'} [2024-08-05 07:27:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7118848. Throughput: 0: 282.3. Samples: 1781630. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:25,484][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:27:25,667][00138] Updated weights for policy 0, policy_version 870 (0.0017) [2024-08-05 07:27:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7127040. Throughput: 0: 281.1. Samples: 1782414. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:30,485][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:27:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7135232. Throughput: 0: 280.8. Samples: 1784129. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:35,485][00034] Avg episode reward: [(0, '-1.372')] [2024-08-05 07:27:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000871_7135232.pth... [2024-08-05 07:27:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000838_6864896.pth [2024-08-05 07:27:36,983][00139] DAMAGECOUNT value on done: 80989.0 [2024-08-05 07:27:37,235][00139] DAMAGECOUNT value on done: 83671.0 [2024-08-05 07:27:37,236][00139] Sum rewards: -7.844, reward structure: {'DEATHCOUNT': '-11.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.774', 'AMMO5': '0.010', 'WEAPON1': '0.010', 'AMMO2': '0.019', 'weapon5': '0.058', 'AMMO4': '0.096', 'WEAPON4': '0.100', 'weapon4': '0.148', 'HITCOUNT': '0.150', 'WEAPON5': '0.150', 'AMMO3': '0.190', 'ARMOR': '0.460', 'DAMAGECOUNT': '0.516', 'WEAPON3': '0.800', 'weapon3': '1.396', 'weapon2': '1.576'} [2024-08-05 07:27:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7143424. Throughput: 0: 281.4. Samples: 1785828. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:40,484][00034] Avg episode reward: [(0, '-1.431')] [2024-08-05 07:27:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7143424. Throughput: 0: 281.9. Samples: 1786651. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:45,484][00034] Avg episode reward: [(0, '-1.431')] [2024-08-05 07:27:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7151616. Throughput: 0: 282.2. Samples: 1788345. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:50,484][00034] Avg episode reward: [(0, '-1.431')] [2024-08-05 07:27:51,884][00139] DAMAGECOUNT value on done: 81074.0 [2024-08-05 07:27:51,885][00139] Sum rewards: -5.127, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.764', 'AMMO2': '0.007', 'weapon4': '0.012', 'AMMO4': '0.032', 'weapon7': '0.056', 'ARMOR': '0.063', 'HITCOUNT': '0.090', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.180', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.255', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon2': '1.250', 'weapon3': '2.352'} [2024-08-05 07:27:52,130][00139] DAMAGECOUNT value on done: 84129.0 [2024-08-05 07:27:52,130][00139] Sum rewards: -2.749, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.870', 'AMMO5': '0.005', 'weapon5': '0.010', 'AMMO2': '0.017', 'WEAPON1': '0.020', 'AMMO4': '0.086', 'weapon7': '0.092', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON5': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.151', 'weapon4': '0.158', 'WEAPON4': '0.200', 'HITCOUNT': '0.400', 'WEAPON3': '0.800', 'weapon3': '1.324', 'DAMAGECOUNT': '1.374', 'weapon2': '1.834', 'FRAGCOUNT': '2.000'} [2024-08-05 07:27:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7159808. Throughput: 0: 282.3. Samples: 1790053. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:27:55,485][00034] Avg episode reward: [(0, '-1.534')] [2024-08-05 07:28:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7159808. Throughput: 0: 282.4. Samples: 1790899. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:00,484][00034] Avg episode reward: [(0, '-1.534')] [2024-08-05 07:28:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7168000. Throughput: 0: 281.0. Samples: 1792579. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:05,484][00034] Avg episode reward: [(0, '-1.534')] [2024-08-05 07:28:06,773][00139] DAMAGECOUNT value on done: 81459.0 [2024-08-05 07:28:06,773][00139] Sum rewards: -1.541, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.940', 'AMMO2': '0.008', 'AMMO5': '0.013', 'weapon4': '0.018', 'AMMO4': '0.038', 'weapon5': '0.038', 'WEAPON4': '0.050', 'AMMO3': '0.164', 'WEAPON5': '0.200', 'HITCOUNT': '0.320', 'WEAPON3': '1.000', 'weapon2': '1.146', 'DAMAGECOUNT': '1.155', 'weapon3': '2.500', 'FRAGCOUNT': '4.000'} [2024-08-05 07:28:07,020][00139] DAMAGECOUNT value on done: 84274.0 [2024-08-05 07:28:07,021][00139] Sum rewards: -1.582, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.098', 'AMMO2': '0.005', 'AMMO5': '0.007', 'ARMOR': '0.012', 'AMMO4': '0.023', 'WEAPON4': '0.050', 'weapon5': '0.070', 'AMMO3': '0.100', 'weapon4': '0.118', 'HITCOUNT': '0.120', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.435', 'WEAPON3': '0.500', 'FRAGCOUNT': '0.500', 'weapon2': '1.538', 'weapon3': '1.638'} [2024-08-05 07:28:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7176192. Throughput: 0: 280.8. Samples: 1794268. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:10,486][00034] Avg episode reward: [(0, '-1.574')] [2024-08-05 07:28:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7176192. Throughput: 0: 282.9. Samples: 1795143. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:15,485][00034] Avg episode reward: [(0, '-1.574')] [2024-08-05 07:28:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7184384. Throughput: 0: 283.4. Samples: 1796882. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:20,484][00034] Avg episode reward: [(0, '-1.574')] [2024-08-05 07:28:21,487][00139] DAMAGECOUNT value on done: 81683.0 [2024-08-05 07:28:21,488][00139] Sum rewards: -0.502, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.336', 'AMMO2': '0.002', 'AMMO5': '0.008', 'AMMO4': '0.012', 'weapon5': '0.056', 'ARMOR': '0.076', 'weapon4': '0.098', 'WEAPON5': '0.100', 'AMMO3': '0.120', 'WEAPON4': '0.150', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.672', 'WEAPON3': '0.850', 'weapon3': '1.436', 'weapon2': '1.794', 'FRAGCOUNT': '2.000'} [2024-08-05 07:28:21,714][00139] DAMAGECOUNT value on done: 84653.0 [2024-08-05 07:28:21,714][00139] Sum rewards: -0.637, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.576', 'AMMO2': '0.006', 'AMMO5': '0.020', 'AMMO4': '0.030', 'ARMOR': '0.041', 'WEAPON4': '0.100', 'weapon4': '0.104', 'AMMO3': '0.181', 'HITCOUNT': '0.200', 'WEAPON5': '0.300', 'weapon5': '0.324', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.137', 'weapon2': '1.330', 'weapon3': '1.916', 'FRAGCOUNT': '4.000'} [2024-08-05 07:28:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7192576. Throughput: 0: 283.3. Samples: 1798576. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:25,484][00034] Avg episode reward: [(0, '-1.466')] [2024-08-05 07:28:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7192576. Throughput: 0: 283.9. Samples: 1799427. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:30,484][00034] Avg episode reward: [(0, '-1.466')] [2024-08-05 07:28:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7200768. Throughput: 0: 282.4. Samples: 1801053. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:35,485][00034] Avg episode reward: [(0, '-1.466')] [2024-08-05 07:28:36,505][00139] DAMAGECOUNT value on done: 82012.0 [2024-08-05 07:28:36,506][00139] Sum rewards: -0.780, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.224', 'weapon4': '0.010', 'AMMO5': '0.011', 'AMMO2': '0.015', 'ARMOR': '0.032', 'WEAPON4': '0.050', 'AMMO4': '0.073', 'weapon5': '0.104', 'weapon7': '0.136', 'AMMO3': '0.158', 'WEAPON5': '0.200', 'AMMO6': '0.200', 'AMMO7': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.230', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.987', 'weapon2': '1.134', 'FRAGCOUNT': '1.500', 'weapon3': '1.804'} [2024-08-05 07:28:36,730][00139] DAMAGECOUNT value on done: 84874.0 [2024-08-05 07:28:36,730][00139] Sum rewards: -0.719, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.308', 'AMMO4': '-0.052', 'AMMO2': '-0.010', 'AMMO5': '0.010', 'ARMOR': '0.044', 'weapon5': '0.096', 'AMMO3': '0.164', 'HITCOUNT': '0.180', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.663', 'WEAPON3': '0.700', 'weapon2': '1.632', 'weapon3': '1.962', 'FRAGCOUNT': '3.000'} [2024-08-05 07:28:38,356][00138] Updated weights for policy 0, policy_version 880 (0.0017) [2024-08-05 07:28:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7208960. Throughput: 0: 281.8. Samples: 1802734. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:40,484][00034] Avg episode reward: [(0, '-1.436')] [2024-08-05 07:28:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7208960. Throughput: 0: 282.3. Samples: 1803603. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:45,484][00034] Avg episode reward: [(0, '-1.436')] [2024-08-05 07:28:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7217152. Throughput: 0: 282.8. Samples: 1805306. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:50,484][00034] Avg episode reward: [(0, '-1.436')] [2024-08-05 07:28:51,369][00139] DAMAGECOUNT value on done: 82332.0 [2024-08-05 07:28:51,592][00139] DAMAGECOUNT value on done: 85300.0 [2024-08-05 07:28:51,593][00139] Sum rewards: -2.604, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.590', 'AMMO4': '-0.062', 'AMMO2': '-0.012', 'AMMO5': '0.023', 'AMMO3': '0.183', 'weapon5': '0.196', 'HITCOUNT': '0.200', 'ARMOR': '0.477', 'WEAPON5': '0.600', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.278', 'weapon2': '1.424', 'FRAGCOUNT': '2.000', 'weapon3': '2.128'} [2024-08-05 07:28:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7225344. Throughput: 0: 282.8. Samples: 1806992. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:28:55,484][00034] Avg episode reward: [(0, '-1.470')] [2024-08-05 07:29:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7233536. Throughput: 0: 282.4. Samples: 1807851. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:00,484][00034] Avg episode reward: [(0, '-1.470')] [2024-08-05 07:29:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7233536. Throughput: 0: 281.1. Samples: 1809533. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:05,484][00034] Avg episode reward: [(0, '-1.470')] [2024-08-05 07:29:06,185][00139] DAMAGECOUNT value on done: 82637.0 [2024-08-05 07:29:06,185][00139] Sum rewards: -0.683, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.750', 'AMMO5': '0.008', 'AMMO2': '0.009', 'ARMOR': '0.012', 'AMMO4': '0.043', 'weapon4': '0.086', 'WEAPON4': '0.100', 'AMMO3': '0.124', 'WEAPON5': '0.150', 'weapon5': '0.156', 'HITCOUNT': '0.250', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.915', 'weapon2': '1.674', 'weapon3': '1.840', 'FRAGCOUNT': '3.000'} [2024-08-05 07:29:06,420][00139] DAMAGECOUNT value on done: 85490.0 [2024-08-05 07:29:06,420][00139] Sum rewards: -1.926, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.300', 'AMMO4': '-0.030', 'AMMO2': '-0.006', 'AMMO5': '0.012', 'WEAPON4': '0.050', 'weapon4': '0.056', 'weapon5': '0.080', 'AMMO3': '0.141', 'HITCOUNT': '0.190', 'WEAPON5': '0.200', 'ARMOR': '0.472', 'DAMAGECOUNT': '0.570', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.534', 'weapon2': '1.854'} [2024-08-05 07:29:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7241728. Throughput: 0: 281.2. Samples: 1811229. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:10,485][00034] Avg episode reward: [(0, '-1.452')] [2024-08-05 07:29:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7249920. Throughput: 0: 281.8. Samples: 1812107. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:15,484][00034] Avg episode reward: [(0, '-1.452')] [2024-08-05 07:29:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7249920. Throughput: 0: 282.9. Samples: 1813782. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:20,484][00034] Avg episode reward: [(0, '-1.452')] [2024-08-05 07:29:21,096][00139] DAMAGECOUNT value on done: 82905.0 [2024-08-05 07:29:21,097][00139] Sum rewards: -2.478, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.830', 'AMMO5': '0.007', 'weapon5': '0.010', 'AMMO2': '0.015', 'weapon4': '0.044', 'ARMOR': '0.045', 'AMMO4': '0.076', 'WEAPON4': '0.100', 'AMMO3': '0.112', 'WEAPON5': '0.150', 'HITCOUNT': '0.210', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.804', 'weapon3': '1.484', 'weapon2': '2.194', 'FRAGCOUNT': '3.000'} [2024-08-05 07:29:21,311][00139] DAMAGECOUNT value on done: 85749.0 [2024-08-05 07:29:21,312][00139] Sum rewards: -3.120, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.404', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'ARMOR': '0.044', 'AMMO3': '0.164', 'HITCOUNT': '0.280', 'DAMAGECOUNT': '0.777', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.572', 'weapon3': '1.802'} [2024-08-05 07:29:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7258112. Throughput: 0: 283.4. Samples: 1815485. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:25,485][00034] Avg episode reward: [(0, '-1.476')] [2024-08-05 07:29:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7266304. Throughput: 0: 284.2. Samples: 1816390. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:30,484][00034] Avg episode reward: [(0, '-1.476')] [2024-08-05 07:29:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7266304. Throughput: 0: 282.6. Samples: 1818025. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:35,485][00034] Avg episode reward: [(0, '-1.476')] [2024-08-05 07:29:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000887_7266304.pth... 
[2024-08-05 07:29:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000854_6995968.pth [2024-08-05 07:29:35,915][00139] DAMAGECOUNT value on done: 83162.0 [2024-08-05 07:29:35,915][00139] Sum rewards: -2.360, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.408', 'AMMO2': '0.002', 'AMMO5': '0.007', 'AMMO4': '0.010', 'ARMOR': '0.016', 'weapon4': '0.046', 'AMMO3': '0.089', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'HITCOUNT': '0.140', 'weapon5': '0.166', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.771', 'weapon3': '1.546', 'weapon2': '1.654'} [2024-08-05 07:29:36,132][00139] DAMAGECOUNT value on done: 86099.0 [2024-08-05 07:29:36,132][00139] Sum rewards: 0.440, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.729', 'AMMO2': '0.009', 'AMMO5': '0.013', 'ARMOR': '0.020', 'AMMO4': '0.047', 'AMMO3': '0.069', 'HITCOUNT': '0.140', 'WEAPON4': '0.150', 'weapon4': '0.212', 'weapon5': '0.264', 'WEAPON5': '0.350', 'WEAPON3': '0.350', 'weapon3': '0.634', 'DAMAGECOUNT': '1.050', 'weapon2': '1.610', 'FRAGCOUNT': '3.000'} [2024-08-05 07:29:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7274496. Throughput: 0: 282.2. Samples: 1819690. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:40,484][00034] Avg episode reward: [(0, '-1.428')] [2024-08-05 07:29:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7282688. Throughput: 0: 282.9. Samples: 1820580. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:45,484][00034] Avg episode reward: [(0, '-1.428')] [2024-08-05 07:29:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7282688. Throughput: 0: 282.9. Samples: 1822265. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:50,484][00034] Avg episode reward: [(0, '-1.428')] [2024-08-05 07:29:50,802][00138] Updated weights for policy 0, policy_version 890 (0.0017) [2024-08-05 07:29:50,828][00139] DAMAGECOUNT value on done: 83647.0 [2024-08-05 07:29:50,828][00139] Sum rewards: -3.408, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.865', 'AMMO2': '0.014', 'AMMO5': '0.029', 'weapon7': '0.054', 'AMMO4': '0.069', 'weapon5': '0.078', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'AMMO3': '0.208', 'weapon4': '0.210', 'WEAPON4': '0.250', 'HITCOUNT': '0.370', 'WEAPON5': '0.450', 'ARMOR': '0.487', 'WEAPON3': '1.100', 'weapon2': '1.130', 'DAMAGECOUNT': '1.455', 'FRAGCOUNT': '2.000', 'weapon3': '2.114'} [2024-08-05 07:29:51,037][00139] DAMAGECOUNT value on done: 86601.0 [2024-08-05 07:29:51,037][00139] Sum rewards: 1.217, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.526', 'AMMO2': '0.003', 'ARMOR': '0.008', 'AMMO4': '0.014', 'AMMO5': '0.015', 'WEAPON4': '0.050', 'weapon5': '0.108', 'AMMO3': '0.115', 'WEAPON5': '0.200', 'HITCOUNT': '0.280', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.506', 'weapon2': '1.796', 'weapon3': '1.798', 'FRAGCOUNT': '3.500'} [2024-08-05 07:29:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7290880. Throughput: 0: 283.9. Samples: 1824004. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:29:55,485][00034] Avg episode reward: [(0, '-1.404')] [2024-08-05 07:30:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7299072. Throughput: 0: 282.5. Samples: 1824821. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:00,485][00034] Avg episode reward: [(0, '-1.404')] [2024-08-05 07:30:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7307264. Throughput: 0: 283.9. Samples: 1826556. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:05,485][00034] Avg episode reward: [(0, '-1.404')] [2024-08-05 07:30:05,694][00139] DAMAGECOUNT value on done: 83736.0 [2024-08-05 07:30:05,695][00139] Sum rewards: -2.023, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.334', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.003', 'AMMO5': '0.010', 'AMMO4': '0.016', 'HITCOUNT': '0.080', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'weapon7': '0.114', 'AMMO3': '0.135', 'weapon5': '0.148', 'WEAPON4': '0.200', 'DAMAGECOUNT': '0.267', 'weapon4': '0.436', 'ARMOR': '0.516', 'WEAPON3': '0.700', 'weapon2': '1.140', 'weapon3': '1.646'} [2024-08-05 07:30:05,940][00139] DAMAGECOUNT value on done: 86961.0 [2024-08-05 07:30:05,940][00139] Sum rewards: -2.379, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.655', 'AMMO5': '0.005', 'AMMO2': '0.011', 'ARMOR': '0.020', 'WEAPON4': '0.050', 'weapon7': '0.052', 'AMMO4': '0.055', 'weapon4': '0.074', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.158', 'HITCOUNT': '0.260', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.080', 'weapon2': '1.302', 'weapon3': '1.908', 'FRAGCOUNT': '2.000'} [2024-08-05 07:30:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7307264. Throughput: 0: 281.9. Samples: 1828170. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:10,484][00034] Avg episode reward: [(0, '-1.403')] [2024-08-05 07:30:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7315456. Throughput: 0: 280.4. Samples: 1829006. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:15,484][00034] Avg episode reward: [(0, '-1.403')] [2024-08-05 07:30:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7323648. Throughput: 0: 281.1. Samples: 1830675. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:20,484][00034] Avg episode reward: [(0, '-1.403')] [2024-08-05 07:30:20,852][00139] DAMAGECOUNT value on done: 84049.0 [2024-08-05 07:30:20,853][00139] Sum rewards: -4.758, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-3.381', 'weapon5': '0.004', 'AMMO2': '0.009', 'AMMO5': '0.010', 'weapon4': '0.010', 'AMMO4': '0.043', 'ARMOR': '0.076', 'weapon7': '0.096', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.192', 'HITCOUNT': '0.210', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'DAMAGECOUNT': '0.939', 'WEAPON3': '1.100', 'weapon3': '1.404', 'weapon2': '1.790', 'FRAGCOUNT': '2.000'} [2024-08-05 07:30:21,098][00139] DAMAGECOUNT value on done: 87580.0 [2024-08-05 07:30:21,098][00139] Sum rewards: 0.887, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.694', 'AMMO4': '-0.061', 'AMMO2': '-0.012', 'AMMO5': '0.010', 'weapon4': '0.038', 'WEAPON4': '0.050', 'ARMOR': '0.068', 'weapon5': '0.108', 'AMMO3': '0.142', 'WEAPON5': '0.200', 'HITCOUNT': '0.480', 'WEAPON3': '0.700', 'weapon2': '1.770', 'DAMAGECOUNT': '1.857', 'weapon3': '1.980', 'FRAGCOUNT': '5.000'} [2024-08-05 07:30:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7323648. Throughput: 0: 282.2. Samples: 1832388. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:25,484][00034] Avg episode reward: [(0, '-1.424')] [2024-08-05 07:30:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7331840. Throughput: 0: 280.8. Samples: 1833217. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:30,485][00034] Avg episode reward: [(0, '-1.424')] [2024-08-05 07:30:35,356][00139] DAMAGECOUNT value on done: 84363.0 [2024-08-05 07:30:35,356][00139] Sum rewards: -5.531, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.374', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.003', 'AMMO5': '0.009', 'AMMO4': '0.016', 'WEAPON1': '0.020', 'ARMOR': '0.060', 'weapon5': '0.090', 'AMMO3': '0.162', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.942', 'WEAPON3': '1.000', 'weapon2': '1.296', 'weapon3': '2.084'} [2024-08-05 07:30:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7340032. Throughput: 0: 282.7. Samples: 1834986. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:35,484][00034] Avg episode reward: [(0, '-1.422')] [2024-08-05 07:30:35,612][00139] DAMAGECOUNT value on done: 87979.0 [2024-08-05 07:30:35,613][00139] Sum rewards: -0.866, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.750', 'AMMO5': '0.003', 'AMMO2': '0.008', 'AMMO4': '0.040', 'ARMOR': '0.046', 'WEAPON4': '0.050', 'WEAPON5': '0.100', 'weapon5': '0.110', 'weapon4': '0.128', 'AMMO3': '0.158', 'HITCOUNT': '0.270', 'WEAPON3': '0.950', 'weapon2': '1.182', 'DAMAGECOUNT': '1.197', 'weapon3': '2.392', 'FRAGCOUNT': '3.000'} [2024-08-05 07:30:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7340032. Throughput: 0: 280.5. Samples: 1836627. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:40,484][00034] Avg episode reward: [(0, '-1.457')] [2024-08-05 07:30:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7348224. Throughput: 0: 281.2. Samples: 1837477. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:45,487][00034] Avg episode reward: [(0, '-1.457')] [2024-08-05 07:30:50,481][00139] DAMAGECOUNT value on done: 84577.0 [2024-08-05 07:30:50,481][00139] Sum rewards: -3.982, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.068', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'AMMO2': '0.016', 'ARMOR': '0.072', 'AMMO4': '0.081', 'HITCOUNT': '0.100', 'WEAPON5': '0.100', 'weapon5': '0.102', 'AMMO3': '0.121', 'WEAPON4': '0.150', 'weapon4': '0.198', 'DAMAGECOUNT': '0.642', 'WEAPON3': '0.800', 'weapon2': '1.426', 'weapon3': '2.020'} [2024-08-05 07:30:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7356416. Throughput: 0: 280.1. Samples: 1839159. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:50,484][00034] Avg episode reward: [(0, '-1.457')] [2024-08-05 07:30:50,732][00139] DAMAGECOUNT value on done: 88449.0 [2024-08-05 07:30:50,733][00139] Sum rewards: -1.347, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.100', 'AMMO5': '0.003', 'AMMO2': '0.005', 'WEAPON1': '0.020', 'AMMO4': '0.025', 'ARMOR': '0.036', 'weapon4': '0.042', 'weapon5': '0.094', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.131', 'HITCOUNT': '0.390', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.410', 'weapon3': '1.552', 'weapon2': '2.146', 'FRAGCOUNT': '3.000'} [2024-08-05 07:30:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7356416. Throughput: 0: 281.8. Samples: 1840852. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:30:55,484][00034] Avg episode reward: [(0, '-1.466')] [2024-08-05 07:31:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7364608. Throughput: 0: 281.6. Samples: 1841676. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:00,484][00034] Avg episode reward: [(0, '-1.466')] [2024-08-05 07:31:03,505][00138] Updated weights for policy 0, policy_version 900 (0.0017) [2024-08-05 07:31:05,385][00139] DAMAGECOUNT value on done: 84885.0 [2024-08-05 07:31:05,386][00139] Sum rewards: 0.189, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.172', 'AMMO2': '0.006', 'AMMO5': '0.010', 'AMMO4': '0.027', 'ARMOR': '0.028', 'weapon5': '0.030', 'WEAPON4': '0.050', 'weapon4': '0.076', 'weapon7': '0.080', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.106', 'WEAPON5': '0.150', 'HITCOUNT': '0.260', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.924', 'weapon2': '0.972', 'FRAGCOUNT': '1.500', 'weapon3': '1.942'} [2024-08-05 07:31:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7372800. Throughput: 0: 282.6. Samples: 1843391. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:05,484][00034] Avg episode reward: [(0, '-1.459')] [2024-08-05 07:31:05,628][00139] DAMAGECOUNT value on done: 88841.0 [2024-08-05 07:31:05,629][00139] Sum rewards: 2.224, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.578', 'ARMOR': '0.012', 'AMMO5': '0.012', 'AMMO2': '0.015', 'WEAPON4': '0.050', 'weapon4': '0.068', 'AMMO4': '0.075', 'weapon7': '0.086', 'AMMO3': '0.097', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon5': '0.128', 'WEAPON5': '0.150', 'HITCOUNT': '0.280', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.176', 'weapon2': '1.584', 'weapon3': '1.918', 'FRAGCOUNT': '3.000'} [2024-08-05 07:31:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7372800. Throughput: 0: 280.0. Samples: 1844990. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:10,484][00034] Avg episode reward: [(0, '-1.399')] [2024-08-05 07:31:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7380992. 
Throughput: 0: 280.9. Samples: 1845856. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:15,484][00034] Avg episode reward: [(0, '-1.399')] [2024-08-05 07:31:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7389184. Throughput: 0: 278.4. Samples: 1847516. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:20,485][00034] Avg episode reward: [(0, '-1.399')] [2024-08-05 07:31:20,596][00139] DAMAGECOUNT value on done: 85287.0 [2024-08-05 07:31:20,597][00139] Sum rewards: 3.912, reward structure: {'DEATHCOUNT': '-3.750', 'HEALTH': '-0.150', 'AMMO4': '-0.013', 'AMMO2': '-0.003', 'AMMO5': '0.007', 'AMMO3': '0.059', 'WEAPON5': '0.100', 'HITCOUNT': '0.200', 'weapon5': '0.220', 'WEAPON3': '0.250', 'ARMOR': '0.903', 'DAMAGECOUNT': '1.206', 'weapon2': '1.362', 'weapon3': '1.520', 'FRAGCOUNT': '2.000'} [2024-08-05 07:31:20,828][00139] DAMAGECOUNT value on done: 89230.0 [2024-08-05 07:31:20,829][00139] Sum rewards: -2.098, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.512', 'AMMO2': '0.002', 'AMMO4': '0.011', 'ARMOR': '0.016', 'AMMO5': '0.020', 'WEAPON4': '0.050', 'weapon7': '0.082', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon4': '0.130', 'AMMO3': '0.153', 'weapon5': '0.182', 'HITCOUNT': '0.250', 'WEAPON5': '0.250', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.650', 'DAMAGECOUNT': '1.167', 'weapon2': '1.372', 'weapon3': '1.528'} [2024-08-05 07:31:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7389184. Throughput: 0: 279.4. Samples: 1849202. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:25,484][00034] Avg episode reward: [(0, '-1.396')] [2024-08-05 07:31:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7397376. Throughput: 0: 279.4. Samples: 1850049. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:30,484][00034] Avg episode reward: [(0, '-1.396')] [2024-08-05 07:31:35,470][00139] DAMAGECOUNT value on done: 85478.0 [2024-08-05 07:31:35,470][00139] Sum rewards: 0.649, reward structure: {'DEATHCOUNT': '-6.000', 'AMMO2': '0.016', 'WEAPON4': '0.050', 'AMMO4': '0.082', 'AMMO3': '0.098', 'HEALTH': '0.144', 'HITCOUNT': '0.180', 'ARMOR': '0.448', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.573', 'weapon2': '1.176', 'weapon3': '1.382', 'FRAGCOUNT': '2.000'} [2024-08-05 07:31:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7405568. Throughput: 0: 279.9. Samples: 1851756. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:35,484][00034] Avg episode reward: [(0, '-1.408')] [2024-08-05 07:31:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000904_7405568.pth... [2024-08-05 07:31:35,567][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000871_7135232.pth [2024-08-05 07:31:35,710][00139] DAMAGECOUNT value on done: 89644.0 [2024-08-05 07:31:35,711][00139] Sum rewards: 2.080, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.183', 'AMMO5': '0.003', 'AMMO2': '0.018', 'WEAPON5': '0.050', 'weapon5': '0.050', 'AMMO4': '0.092', 'ARMOR': '0.096', 'WEAPON4': '0.100', 'AMMO3': '0.104', 'weapon4': '0.234', 'HITCOUNT': '0.270', 'WEAPON3': '0.650', 'DAMAGECOUNT': '1.242', 'weapon2': '1.280', 'weapon3': '1.824', 'FRAGCOUNT': '3.000'} [2024-08-05 07:31:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7413760. Throughput: 0: 278.2. Samples: 1853372. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:40,484][00034] Avg episode reward: [(0, '-1.383')] [2024-08-05 07:31:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7413760. Throughput: 0: 278.3. Samples: 1854199. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:45,484][00034] Avg episode reward: [(0, '-1.383')] [2024-08-05 07:31:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7421952. Throughput: 0: 277.7. Samples: 1855889. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:50,485][00034] Avg episode reward: [(0, '-1.383')] [2024-08-05 07:31:50,615][00139] DAMAGECOUNT value on done: 85758.0 [2024-08-05 07:31:50,616][00139] Sum rewards: -1.674, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.064', 'weapon5': '0.008', 'AMMO5': '0.010', 'AMMO2': '0.011', 'AMMO4': '0.055', 'WEAPON4': '0.100', 'ARMOR': '0.120', 'AMMO3': '0.136', 'weapon4': '0.154', 'WEAPON5': '0.200', 'HITCOUNT': '0.230', 'DAMAGECOUNT': '0.840', 'WEAPON3': '0.900', 'weapon2': '1.446', 'FRAGCOUNT': '2.000', 'weapon3': '2.180'} [2024-08-05 07:31:50,837][00139] DAMAGECOUNT value on done: 89934.0 [2024-08-05 07:31:50,837][00139] Sum rewards: -1.989, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.124', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'AMMO5': '0.005', 'weapon5': '0.018', 'WEAPON5': '0.100', 'AMMO3': '0.149', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'weapon4': '0.274', 'FRAGCOUNT': '0.500', 'ARMOR': '0.600', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.870', 'weapon2': '1.362', 'weapon3': '1.638'} [2024-08-05 07:31:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7430144. Throughput: 0: 280.7. Samples: 1857621. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:31:55,484][00034] Avg episode reward: [(0, '-1.403')] [2024-08-05 07:32:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7430144. Throughput: 0: 281.0. Samples: 1858501. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:00,485][00034] Avg episode reward: [(0, '-1.403')] [2024-08-05 07:32:05,398][00139] DAMAGECOUNT value on done: 86053.0 [2024-08-05 07:32:05,399][00139] Sum rewards: 1.606, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.445', 'AMMO4': '-0.043', 'AMMO2': '-0.009', 'AMMO5': '0.007', 'WEAPON1': '0.020', 'ARMOR': '0.044', 'weapon5': '0.046', 'AMMO3': '0.080', 'WEAPON5': '0.200', 'HITCOUNT': '0.260', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.885', 'weapon2': '1.772', 'weapon3': '1.838', 'FRAGCOUNT': '4.000'} [2024-08-05 07:32:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7438336. Throughput: 0: 281.4. Samples: 1860179. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:05,485][00034] Avg episode reward: [(0, '-1.311')] [2024-08-05 07:32:05,636][00139] DAMAGECOUNT value on done: 90182.0 [2024-08-05 07:32:05,636][00139] Sum rewards: -7.443, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.756', 'AMMO4': '-0.014', 'AMMO2': '-0.003', 'AMMO5': '0.018', 'ARMOR': '0.022', 'weapon5': '0.038', 'WEAPON4': '0.050', 'AMMO3': '0.144', 'HITCOUNT': '0.210', 'WEAPON5': '0.350', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.744', 'WEAPON3': '1.000', 'weapon2': '1.702', 'weapon3': '1.802'} [2024-08-05 07:32:10,483][00034] Fps is (10 sec: 1638.3, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7446528. Throughput: 0: 281.8. Samples: 1861882. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:10,487][00034] Avg episode reward: [(0, '-1.321')] [2024-08-05 07:32:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7446528. Throughput: 0: 280.4. Samples: 1862669. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:15,484][00034] Avg episode reward: [(0, '-1.321')] [2024-08-05 07:32:16,786][00138] Updated weights for policy 0, policy_version 910 (0.0018) [2024-08-05 07:32:20,440][00139] DAMAGECOUNT value on done: 86265.0 [2024-08-05 07:32:20,441][00139] Sum rewards: -7.657, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-2.004', 'AMMO2': '0.001', 'AMMO4': '0.004', 'ARMOR': '0.076', 'HITCOUNT': '0.190', 'AMMO3': '0.234', 'DAMAGECOUNT': '0.636', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.150', 'weapon3': '1.806', 'weapon2': '2.000'} [2024-08-05 07:32:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7454720. Throughput: 0: 280.1. Samples: 1864362. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:20,484][00034] Avg episode reward: [(0, '-1.344')] [2024-08-05 07:32:20,658][00139] DAMAGECOUNT value on done: 90649.0 [2024-08-05 07:32:20,658][00139] Sum rewards: -1.911, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.310', 'AMMO4': '-0.031', 'AMMO2': '-0.006', 'weapon4': '0.010', 'AMMO5': '0.014', 'weapon5': '0.022', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.141', 'WEAPON5': '0.150', 'HITCOUNT': '0.330', 'ARMOR': '0.516', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.401', 'weapon2': '1.868', 'weapon3': '1.934', 'FRAGCOUNT': '3.000'} [2024-08-05 07:32:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7462912. Throughput: 0: 282.3. Samples: 1866077. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:25,484][00034] Avg episode reward: [(0, '-1.360')] [2024-08-05 07:32:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7462912. Throughput: 0: 282.7. Samples: 1866920. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:30,484][00034] Avg episode reward: [(0, '-1.360')] [2024-08-05 07:32:35,215][00139] DAMAGECOUNT value on done: 86595.0 [2024-08-05 07:32:35,215][00139] Sum rewards: -2.013, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-2.270', 'AMMO4': '-0.068', 'AMMO2': '-0.013', 'AMMO5': '0.019', 'weapon5': '0.052', 'ARMOR': '0.062', 'AMMO3': '0.173', 'HITCOUNT': '0.230', 'WEAPON5': '0.400', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.690', 'weapon3': '1.530', 'FRAGCOUNT': '2.000', 'weapon2': '2.032'} [2024-08-05 07:32:35,452][00139] DAMAGECOUNT value on done: 90804.0 [2024-08-05 07:32:35,452][00139] Sum rewards: -4.582, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.240', 'AMMO4': '-0.015', 'AMMO2': '-0.003', 'AMMO5': '0.005', 'ARMOR': '0.046', 'WEAPON5': '0.050', 'weapon5': '0.054', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.157', 'HITCOUNT': '0.160', 'DAMAGECOUNT': '0.465', 'WEAPON3': '0.950', 'weapon2': '1.562', 'FRAGCOUNT': '2.000', 'weapon3': '2.176'} [2024-08-05 07:32:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7471104. Throughput: 0: 283.1. Samples: 1868629. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:35,484][00034] Avg episode reward: [(0, '-1.360')] [2024-08-05 07:32:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7479296. Throughput: 0: 282.3. Samples: 1870326. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:40,484][00034] Avg episode reward: [(0, '-1.360')] [2024-08-05 07:32:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7479296. Throughput: 0: 280.5. Samples: 1871122. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:45,484][00034] Avg episode reward: [(0, '-1.360')] [2024-08-05 07:32:50,431][00139] DAMAGECOUNT value on done: 86908.0 [2024-08-05 07:32:50,432][00139] Sum rewards: -3.369, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.026', 'AMMO2': '0.000', 'AMMO4': '0.001', 'AMMO5': '0.003', 'weapon4': '0.032', 'WEAPON5': '0.050', 'weapon5': '0.052', 'WEAPON4': '0.100', 'AMMO3': '0.185', 'HITCOUNT': '0.310', 'ARMOR': '0.501', 'DAMAGECOUNT': '0.939', 'WEAPON3': '1.050', 'weapon3': '1.742', 'weapon2': '1.942', 'FRAGCOUNT': '3.000'} [2024-08-05 07:32:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7487488. Throughput: 0: 280.0. Samples: 1872781. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:50,484][00034] Avg episode reward: [(0, '-1.454')] [2024-08-05 07:32:50,668][00139] DAMAGECOUNT value on done: 91235.0 [2024-08-05 07:32:50,669][00139] Sum rewards: -6.194, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.014', 'weapon5': '0.006', 'AMMO2': '0.007', 'AMMO5': '0.010', 'AMMO4': '0.033', 'ARMOR': '0.092', 'AMMO3': '0.169', 'WEAPON5': '0.200', 'HITCOUNT': '0.360', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.100', 'DAMAGECOUNT': '1.293', 'weapon2': '1.640', 'weapon3': '1.910'} [2024-08-05 07:32:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7495680. Throughput: 0: 279.5. Samples: 1874461. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:32:55,485][00034] Avg episode reward: [(0, '-1.515')] [2024-08-05 07:33:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7495680. Throughput: 0: 280.8. Samples: 1875305. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:00,484][00034] Avg episode reward: [(0, '-1.515')] [2024-08-05 07:33:05,275][00139] DAMAGECOUNT value on done: 87018.0 [2024-08-05 07:33:05,276][00139] Sum rewards: -3.871, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.844', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.001', 'AMMO5': '0.005', 'AMMO4': '0.005', 'weapon5': '0.022', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'HITCOUNT': '0.060', 'AMMO3': '0.119', 'weapon4': '0.132', 'DAMAGECOUNT': '0.330', 'WEAPON3': '0.600', 'ARMOR': '0.906', 'weapon3': '1.262', 'weapon2': '1.680'} [2024-08-05 07:33:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7503872. Throughput: 0: 281.2. Samples: 1877014. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:05,484][00034] Avg episode reward: [(0, '-1.570')] [2024-08-05 07:33:05,507][00139] DAMAGECOUNT value on done: 91618.0 [2024-08-05 07:33:05,508][00139] Sum rewards: -2.385, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.004', 'AMMO4': '-0.016', 'AMMO2': '-0.003', 'AMMO5': '0.010', 'weapon5': '0.048', 'ARMOR': '0.080', 'AMMO3': '0.159', 'WEAPON5': '0.200', 'HITCOUNT': '0.350', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.149', 'weapon2': '1.456', 'FRAGCOUNT': '2.000', 'weapon3': '2.236'} [2024-08-05 07:33:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7512064. Throughput: 0: 280.8. Samples: 1878714. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:10,484][00034] Avg episode reward: [(0, '-1.627')] [2024-08-05 07:33:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7520256. Throughput: 0: 282.0. Samples: 1879608. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:15,485][00034] Avg episode reward: [(0, '-1.627')] [2024-08-05 07:33:20,090][00139] DAMAGECOUNT value on done: 87163.0 [2024-08-05 07:33:20,326][00139] DAMAGECOUNT value on done: 91946.0 [2024-08-05 07:33:20,326][00139] Sum rewards: -3.457, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.464', 'AMMO5': '0.005', 'AMMO2': '0.014', 'WEAPON5': '0.050', 'AMMO4': '0.070', 'weapon5': '0.076', 'AMMO3': '0.153', 'HITCOUNT': '0.250', 'ARMOR': '0.478', 'WEAPON3': '0.850', 'DAMAGECOUNT': '0.984', 'weapon2': '1.542', 'weapon3': '2.034', 'FRAGCOUNT': '2.500'} [2024-08-05 07:33:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7520256. Throughput: 0: 280.6. Samples: 1881256. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:20,484][00034] Avg episode reward: [(0, '-1.688')] [2024-08-05 07:33:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7528448. Throughput: 0: 280.6. Samples: 1882952. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:25,485][00034] Avg episode reward: [(0, '-1.688')] [2024-08-05 07:33:29,526][00138] Updated weights for policy 0, policy_version 920 (0.0017) [2024-08-05 07:33:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7536640. Throughput: 0: 282.0. Samples: 1883812. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:30,485][00034] Avg episode reward: [(0, '-1.688')] [2024-08-05 07:33:35,084][00139] DAMAGECOUNT value on done: 87515.0 [2024-08-05 07:33:35,085][00139] Sum rewards: 2.790, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.291', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'AMMO5': '0.014', 'weapon7': '0.056', 'AMMO3': '0.080', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.170', 'WEAPON7': '0.200', 'WEAPON5': '0.250', 'HITCOUNT': '0.270', 'WEAPON3': '0.400', 'ARMOR': '0.552', 'DAMAGECOUNT': '1.056', 'weapon3': '1.204', 'weapon2': '1.844', 'FRAGCOUNT': '2.000'} [2024-08-05 07:33:35,331][00139] DAMAGECOUNT value on done: 92126.0 [2024-08-05 07:33:35,332][00139] Sum rewards: -2.681, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.770', 'AMMO2': '0.007', 'AMMO4': '0.035', 'AMMO3': '0.132', 'HITCOUNT': '0.170', 'ARMOR': '0.533', 'DAMAGECOUNT': '0.540', 'WEAPON3': '0.800', 'weapon2': '1.626', 'weapon3': '1.996', 'FRAGCOUNT': '2.000'} [2024-08-05 07:33:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7536640. Throughput: 0: 282.0. Samples: 1885470. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:35,485][00034] Avg episode reward: [(0, '-1.713')] [2024-08-05 07:33:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000920_7536640.pth... [2024-08-05 07:33:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000887_7266304.pth [2024-08-05 07:33:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7544832. Throughput: 0: 282.5. Samples: 1887174. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:40,485][00034] Avg episode reward: [(0, '-1.713')] [2024-08-05 07:33:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7553024. Throughput: 0: 282.7. Samples: 1888028. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:45,486][00034] Avg episode reward: [(0, '-1.713')] [2024-08-05 07:33:50,048][00139] DAMAGECOUNT value on done: 87915.0 [2024-08-05 07:33:50,049][00139] Sum rewards: -0.263, reward structure: {'DEATHCOUNT': '-7.500', 'AMMO5': '0.012', 'AMMO2': '0.015', 'AMMO4': '0.074', 'weapon5': '0.102', 'AMMO3': '0.118', 'WEAPON5': '0.250', 'HITCOUNT': '0.290', 'HEALTH': '0.444', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.200', 'weapon3': '1.348', 'weapon2': '1.384', 'FRAGCOUNT': '1.500'} [2024-08-05 07:33:50,285][00139] DAMAGECOUNT value on done: 92268.0 [2024-08-05 07:33:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7553024. Throughput: 0: 281.4. Samples: 1889677. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:50,485][00034] Avg episode reward: [(0, '-1.749')] [2024-08-05 07:33:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7561216. Throughput: 0: 281.1. Samples: 1891362. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:33:55,485][00034] Avg episode reward: [(0, '-1.749')] [2024-08-05 07:34:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7569408. Throughput: 0: 281.0. Samples: 1892251. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:00,485][00034] Avg episode reward: [(0, '-1.749')] [2024-08-05 07:34:04,907][00139] DAMAGECOUNT value on done: 88130.0 [2024-08-05 07:34:04,907][00139] Sum rewards: -3.562, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.410', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'weapon5': '0.002', 'AMMO5': '0.015', 'ARMOR': '0.048', 'WEAPON5': '0.150', 'AMMO3': '0.160', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.900', 'weapon2': '1.662', 'weapon3': '1.840', 'FRAGCOUNT': '2.000'} [2024-08-05 07:34:05,124][00139] DAMAGECOUNT value on done: 92402.0 [2024-08-05 07:34:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7569408. Throughput: 0: 281.2. Samples: 1893912. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:05,484][00034] Avg episode reward: [(0, '-1.772')] [2024-08-05 07:34:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7577600. Throughput: 0: 281.5. Samples: 1895618. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:10,485][00034] Avg episode reward: [(0, '-1.772')] [2024-08-05 07:34:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7585792. Throughput: 0: 281.1. Samples: 1896462. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:15,485][00034] Avg episode reward: [(0, '-1.772')] [2024-08-05 07:34:19,990][00139] DAMAGECOUNT value on done: 88438.0 [2024-08-05 07:34:19,991][00139] Sum rewards: -2.889, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-3.108', 'AMMO4': '-0.111', 'AMMO2': '-0.022', 'AMMO5': '0.010', 'weapon5': '0.014', 'ARMOR': '0.056', 'WEAPON5': '0.100', 'AMMO3': '0.208', 'HITCOUNT': '0.270', 'DAMAGECOUNT': '0.924', 'WEAPON3': '1.150', 'weapon3': '1.678', 'weapon2': '1.942', 'FRAGCOUNT': '3.000'} [2024-08-05 07:34:20,232][00139] DAMAGECOUNT value on done: 92522.0 [2024-08-05 07:34:20,233][00139] Sum rewards: -2.802, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.880', 'AMMO4': '-0.098', 'AMMO2': '-0.020', 'AMMO5': '0.013', 'ARMOR': '0.036', 'weapon5': '0.104', 'AMMO3': '0.109', 'HITCOUNT': '0.120', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.360', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.314', 'weapon2': '1.990'} [2024-08-05 07:34:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7585792. Throughput: 0: 280.5. Samples: 1898092. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:20,485][00034] Avg episode reward: [(0, '-1.722')] [2024-08-05 07:34:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7593984. Throughput: 0: 280.6. Samples: 1899801. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:25,484][00034] Avg episode reward: [(0, '-1.722')] [2024-08-05 07:34:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7602176. Throughput: 0: 280.3. Samples: 1900642. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:30,485][00034] Avg episode reward: [(0, '-1.722')] [2024-08-05 07:34:34,895][00139] DAMAGECOUNT value on done: 88568.0 [2024-08-05 07:34:34,895][00139] Sum rewards: -2.161, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.493', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.013', 'ARMOR': '0.020', 'WEAPON4': '0.100', 'weapon4': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.121', 'weapon5': '0.184', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.390', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.294', 'weapon3': '1.700'} [2024-08-05 07:34:35,155][00139] DAMAGECOUNT value on done: 92897.0 [2024-08-05 07:34:35,156][00139] Sum rewards: -4.539, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.626', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'AMMO5': '0.020', 'weapon5': '0.066', 'AMMO3': '0.175', 'WEAPON5': '0.200', 'HITCOUNT': '0.280', 'ARMOR': '0.490', 'FRAGCOUNT': '0.500', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.125', 'weapon2': '1.426', 'weapon3': '2.260'} [2024-08-05 07:34:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7610368. Throughput: 0: 281.0. Samples: 1902321. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:35,484][00034] Avg episode reward: [(0, '-1.791')] [2024-08-05 07:34:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7610368. Throughput: 0: 281.1. Samples: 1904010. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:40,484][00034] Avg episode reward: [(0, '-1.791')] [2024-08-05 07:34:42,376][00138] Updated weights for policy 0, policy_version 930 (0.0018) [2024-08-05 07:34:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7618560. Throughput: 0: 280.6. Samples: 1904876. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:45,485][00034] Avg episode reward: [(0, '-1.791')] [2024-08-05 07:34:49,735][00139] DAMAGECOUNT value on done: 88740.0 [2024-08-05 07:34:50,003][00139] DAMAGECOUNT value on done: 93332.0 [2024-08-05 07:34:50,004][00139] Sum rewards: -2.071, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.258', 'AMMO2': '0.010', 'AMMO4': '0.050', 'WEAPON4': '0.050', 'weapon4': '0.056', 'ARMOR': '0.092', 'AMMO3': '0.164', 'HITCOUNT': '0.340', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.305', 'weapon2': '1.586', 'weapon3': '2.034', 'FRAGCOUNT': '3.000'} [2024-08-05 07:34:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7626752. Throughput: 0: 281.9. Samples: 1906597. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:50,487][00034] Avg episode reward: [(0, '-1.852')] [2024-08-05 07:34:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7626752. Throughput: 0: 280.8. Samples: 1908252. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:34:55,487][00034] Avg episode reward: [(0, '-1.852')] [2024-08-05 07:35:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7634944. Throughput: 0: 280.4. Samples: 1909080. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:00,484][00034] Avg episode reward: [(0, '-1.852')] [2024-08-05 07:35:04,684][00139] DAMAGECOUNT value on done: 88986.0 [2024-08-05 07:35:04,685][00139] Sum rewards: -1.933, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.646', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'AMMO5': '0.012', 'ARMOR': '0.045', 'weapon5': '0.080', 'AMMO3': '0.144', 'WEAPON5': '0.150', 'HITCOUNT': '0.180', 'DAMAGECOUNT': '0.738', 'WEAPON3': '0.900', 'weapon2': '1.140', 'FRAGCOUNT': '2.000', 'weapon3': '2.330'} [2024-08-05 07:35:04,931][00139] DAMAGECOUNT value on done: 93947.0 [2024-08-05 07:35:04,931][00139] Sum rewards: 5.509, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.656', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'AMMO5': '0.030', 'weapon5': '0.104', 'AMMO3': '0.116', 'WEAPON5': '0.250', 'HITCOUNT': '0.410', 'ARMOR': '0.477', 'WEAPON3': '0.600', 'weapon2': '1.520', 'weapon3': '1.818', 'DAMAGECOUNT': '1.845', 'FRAGCOUNT': '5.000'} [2024-08-05 07:35:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7643136. Throughput: 0: 282.2. Samples: 1910791. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:05,484][00034] Avg episode reward: [(0, '-1.818')] [2024-08-05 07:35:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7643136. Throughput: 0: 281.7. Samples: 1912478. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:10,484][00034] Avg episode reward: [(0, '-1.818')] [2024-08-05 07:35:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7651328. Throughput: 0: 281.8. Samples: 1913324. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:15,485][00034] Avg episode reward: [(0, '-1.818')] [2024-08-05 07:35:19,490][00139] DAMAGECOUNT value on done: 89275.0 [2024-08-05 07:35:19,490][00139] Sum rewards: 1.157, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.266', 'AMMO4': '-0.027', 'AMMO2': '-0.005', 'AMMO5': '0.022', 'ARMOR': '0.060', 'weapon7': '0.064', 'weapon5': '0.096', 'AMMO3': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HITCOUNT': '0.170', 'WEAPON7': '0.200', 'WEAPON5': '0.350', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.867', 'weapon2': '1.612', 'weapon3': '1.874', 'FRAGCOUNT': '3.000'} [2024-08-05 07:35:19,736][00139] DAMAGECOUNT value on done: 94549.0 [2024-08-05 07:35:19,737][00139] Sum rewards: 1.677, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.394', 'AMMO4': '-0.034', 'AMMO2': '-0.007', 'AMMO5': '0.005', 'ARMOR': '0.040', 'weapon7': '0.086', 'AMMO3': '0.098', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'weapon5': '0.118', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.198', 'WEAPON7': '0.200', 'HITCOUNT': '0.260', 'WEAPON3': '0.600', 'weapon2': '1.234', 'weapon3': '1.526', 'DAMAGECOUNT': '1.806', 'FRAGCOUNT': '2.500'} [2024-08-05 07:35:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7659520. Throughput: 0: 282.6. Samples: 1915038. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:20,485][00034] Avg episode reward: [(0, '-1.735')] [2024-08-05 07:35:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7659520. Throughput: 0: 281.3. Samples: 1916670. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:25,484][00034] Avg episode reward: [(0, '-1.735')] [2024-08-05 07:35:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7667712. Throughput: 0: 280.0. Samples: 1917478. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:30,485][00034] Avg episode reward: [(0, '-1.735')] [2024-08-05 07:35:34,867][00139] DAMAGECOUNT value on done: 89558.0 [2024-08-05 07:35:34,868][00139] Sum rewards: -1.882, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.335', 'AMMO5': '0.003', 'AMMO2': '0.010', 'ARMOR': '0.040', 'AMMO4': '0.052', 'AMMO3': '0.167', 'WEAPON4': '0.200', 'weapon4': '0.246', 'HITCOUNT': '0.250', 'DAMAGECOUNT': '0.849', 'WEAPON3': '0.950', 'weapon2': '1.378', 'weapon3': '1.558', 'FRAGCOUNT': '2.000'} [2024-08-05 07:35:35,118][00139] DAMAGECOUNT value on done: 94578.0 [2024-08-05 07:35:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7675904. Throughput: 0: 278.6. Samples: 1919134. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:35,484][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:35:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000937_7675904.pth... [2024-08-05 07:35:35,579][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000904_7405568.pth [2024-08-05 07:35:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7675904. Throughput: 0: 279.0. Samples: 1920807. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:40,484][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:35:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7684096. Throughput: 0: 279.8. Samples: 1921672. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:45,484][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:35:49,861][00139] DAMAGECOUNT value on done: 89817.0 [2024-08-05 07:35:49,862][00139] Sum rewards: -3.247, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.544', 'AMMO4': '-0.085', 'AMMO2': '-0.017', 'AMMO5': '0.005', 'AMMO3': '0.135', 'HITCOUNT': '0.220', 'ARMOR': '0.516', 'DAMAGECOUNT': '0.777', 'WEAPON3': '0.900', 'weapon2': '1.748', 'weapon3': '1.848', 'FRAGCOUNT': '2.000'} [2024-08-05 07:35:50,078][00139] DAMAGECOUNT value on done: 94685.0 [2024-08-05 07:35:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7692288. Throughput: 0: 278.8. Samples: 1923338. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:50,485][00034] Avg episode reward: [(0, '-1.810')] [2024-08-05 07:35:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7692288. Throughput: 0: 278.0. Samples: 1924989. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:35:55,484][00034] Avg episode reward: [(0, '-1.810')] [2024-08-05 07:35:55,701][00138] Updated weights for policy 0, policy_version 940 (0.0017) [2024-08-05 07:36:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7700480. Throughput: 0: 278.6. Samples: 1925861. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:00,484][00034] Avg episode reward: [(0, '-1.810')] [2024-08-05 07:36:04,806][00139] DAMAGECOUNT value on done: 89987.0 [2024-08-05 07:36:04,807][00139] Sum rewards: -5.539, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.660', 'AMMO4': '-0.040', 'AMMO2': '-0.008', 'AMMO5': '0.014', 'weapon5': '0.082', 'weapon4': '0.092', 'WEAPON4': '0.100', 'HITCOUNT': '0.180', 'WEAPON5': '0.200', 'AMMO3': '0.203', 'DAMAGECOUNT': '0.510', 'ARMOR': '0.584', 'WEAPON3': '1.000', 'FRAGCOUNT': '1.000', 'weapon3': '1.586', 'weapon2': '1.868'} [2024-08-05 07:36:05,055][00139] DAMAGECOUNT value on done: 95150.0 [2024-08-05 07:36:05,055][00139] Sum rewards: -5.826, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.084', 'AMMO4': '-0.013', 'AMMO2': '-0.003', 'AMMO5': '0.013', 'ARMOR': '0.028', 'WEAPON4': '0.050', 'weapon4': '0.090', 'weapon5': '0.094', 'AMMO3': '0.228', 'WEAPON5': '0.250', 'HITCOUNT': '0.460', 'WEAPON3': '1.350', 'DAMAGECOUNT': '1.395', 'weapon2': '1.444', 'weapon3': '2.122', 'FRAGCOUNT': '3.000'} [2024-08-05 07:36:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7708672. Throughput: 0: 277.8. Samples: 1927539. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:05,486][00034] Avg episode reward: [(0, '-1.885')] [2024-08-05 07:36:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7716864. Throughput: 0: 278.2. Samples: 1929187. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:10,485][00034] Avg episode reward: [(0, '-1.885')] [2024-08-05 07:36:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7716864. Throughput: 0: 279.1. Samples: 1930038. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:15,484][00034] Avg episode reward: [(0, '-1.885')] [2024-08-05 07:36:19,860][00139] DAMAGECOUNT value on done: 90122.0 [2024-08-05 07:36:19,861][00139] Sum rewards: 1.017, reward structure: {'DEATHCOUNT': '-3.000', 'HEALTH': '-0.090', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'AMMO5': '0.010', 'weapon5': '0.030', 'AMMO3': '0.090', 'HITCOUNT': '0.150', 'WEAPON5': '0.200', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.405', 'FRAGCOUNT': '0.500', 'weapon2': '0.698', 'weapon3': '1.686'} [2024-08-05 07:36:20,099][00139] DAMAGECOUNT value on done: 95382.0 [2024-08-05 07:36:20,099][00139] Sum rewards: -5.573, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.838', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'AMMO5': '0.010', 'ARMOR': '0.064', 'weapon5': '0.070', 'WEAPON5': '0.150', 'AMMO3': '0.187', 'HITCOUNT': '0.190', 'DAMAGECOUNT': '0.696', 'WEAPON3': '1.000', 'weapon3': '1.692', 'weapon2': '1.968', 'FRAGCOUNT': '3.000'} [2024-08-05 07:36:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7725056. Throughput: 0: 279.8. Samples: 1931724. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:20,485][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:36:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7733248. Throughput: 0: 279.3. Samples: 1933374. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:25,485][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:36:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7733248. Throughput: 0: 279.2. Samples: 1934236. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:30,485][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:36:34,833][00139] DAMAGECOUNT value on done: 90451.0 [2024-08-05 07:36:34,833][00139] Sum rewards: -1.567, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.056', 'AMMO4': '-0.007', 'AMMO2': '-0.001', 'ARMOR': '0.004', 'AMMO5': '0.010', 'weapon7': '0.066', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.134', 'WEAPON5': '0.150', 'AMMO3': '0.162', 'HITCOUNT': '0.190', 'WEAPON7': '0.200', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.987', 'weapon2': '1.600', 'weapon3': '1.954', 'FRAGCOUNT': '2.000'} [2024-08-05 07:36:35,078][00139] DAMAGECOUNT value on done: 96168.0 [2024-08-05 07:36:35,078][00139] Sum rewards: 3.332, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.756', 'WEAPON1': '0.010', 'AMMO5': '0.017', 'AMMO2': '0.019', 'AMMO4': '0.097', 'AMMO3': '0.134', 'weapon5': '0.228', 'WEAPON5': '0.350', 'ARMOR': '0.505', 'HITCOUNT': '0.540', 'WEAPON3': '1.050', 'weapon2': '1.382', 'weapon3': '2.148', 'DAMAGECOUNT': '2.358', 'FRAGCOUNT': '5.500'} [2024-08-05 07:36:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7741440. Throughput: 0: 280.0. Samples: 1935938. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:35,486][00034] Avg episode reward: [(0, '-1.917')] [2024-08-05 07:36:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7749632. Throughput: 0: 280.4. Samples: 1937607. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:40,484][00034] Avg episode reward: [(0, '-1.917')] [2024-08-05 07:36:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7749632. Throughput: 0: 280.4. Samples: 1938481. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:45,485][00034] Avg episode reward: [(0, '-1.917')] [2024-08-05 07:36:49,795][00139] DAMAGECOUNT value on done: 90736.0 [2024-08-05 07:36:49,796][00139] Sum rewards: -2.485, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.626', 'AMMO4': '-0.026', 'AMMO2': '-0.005', 'AMMO5': '0.007', 'weapon5': '0.014', 'ARMOR': '0.064', 'WEAPON5': '0.150', 'AMMO3': '0.220', 'HITCOUNT': '0.230', 'DAMAGECOUNT': '0.855', 'WEAPON3': '1.100', 'weapon2': '1.372', 'weapon3': '1.910', 'FRAGCOUNT': '3.000'} [2024-08-05 07:36:50,049][00139] DAMAGECOUNT value on done: 96448.0 [2024-08-05 07:36:50,049][00139] Sum rewards: 0.186, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.219', 'AMMO5': '0.007', 'AMMO2': '0.017', 'weapon4': '0.018', 'WEAPON4': '0.050', 'ARMOR': '0.072', 'weapon5': '0.074', 'AMMO4': '0.084', 'WEAPON5': '0.100', 'AMMO3': '0.126', 'HITCOUNT': '0.230', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.840', 'weapon2': '1.240', 'FRAGCOUNT': '2.000', 'weapon3': '2.296'} [2024-08-05 07:36:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7757824. Throughput: 0: 280.2. Samples: 1940150. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:50,484][00034] Avg episode reward: [(0, '-1.904')] [2024-08-05 07:36:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7766016. Throughput: 0: 281.2. Samples: 1941841. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:36:55,485][00034] Avg episode reward: [(0, '-1.904')] [2024-08-05 07:37:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7766016. Throughput: 0: 280.2. Samples: 1942645. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:00,484][00034] Avg episode reward: [(0, '-1.904')] [2024-08-05 07:37:04,803][00139] DAMAGECOUNT value on done: 91067.0 [2024-08-05 07:37:04,804][00139] Sum rewards: -3.042, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.116', 'AMMO2': '0.006', 'AMMO5': '0.007', 'AMMO4': '0.030', 'ARMOR': '0.048', 'weapon5': '0.058', 'weapon4': '0.066', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.156', 'WEAPON5': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.900', 'DAMAGECOUNT': '0.993', 'FRAGCOUNT': '1.000', 'weapon2': '1.280', 'weapon3': '2.390'} [2024-08-05 07:37:05,036][00139] DAMAGECOUNT value on done: 96738.0 [2024-08-05 07:37:05,036][00139] Sum rewards: -2.771, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.781', 'AMMO4': '-0.004', 'AMMO2': '-0.001', 'weapon4': '0.028', 'ARMOR': '0.032', 'WEAPON4': '0.050', 'AMMO3': '0.138', 'HITCOUNT': '0.220', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.870', 'weapon3': '1.114', 'weapon2': '2.662', 'FRAGCOUNT': '3.000'} [2024-08-05 07:37:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7774208. Throughput: 0: 280.4. Samples: 1944344. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:05,485][00034] Avg episode reward: [(0, '-1.947')] [2024-08-05 07:37:08,584][00138] Updated weights for policy 0, policy_version 950 (0.0017) [2024-08-05 07:37:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7782400. Throughput: 0: 281.3. Samples: 1946034. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:10,484][00034] Avg episode reward: [(0, '-1.947')] [2024-08-05 07:37:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7782400. Throughput: 0: 281.3. Samples: 1946895. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:15,484][00034] Avg episode reward: [(0, '-1.947')] [2024-08-05 07:37:19,777][00139] DAMAGECOUNT value on done: 91278.0 [2024-08-05 07:37:19,778][00139] Sum rewards: -0.023, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.875', 'AMMO4': '-0.014', 'AMMO2': '-0.003', 'AMMO5': '0.005', 'ARMOR': '0.020', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.118', 'HITCOUNT': '0.160', 'weapon4': '0.168', 'DAMAGECOUNT': '0.633', 'WEAPON3': '0.700', 'weapon2': '0.942', 'weapon3': '1.922', 'FRAGCOUNT': '2.000'} [2024-08-05 07:37:20,008][00139] DAMAGECOUNT value on done: 97133.0 [2024-08-05 07:37:20,008][00139] Sum rewards: -1.359, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.300', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.005', 'ARMOR': '0.072', 'weapon5': '0.076', 'WEAPON5': '0.100', 'AMMO3': '0.112', 'HITCOUNT': '0.300', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.185', 'weapon3': '1.406', 'weapon2': '1.586', 'FRAGCOUNT': '2.500'} [2024-08-05 07:37:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7790592. Throughput: 0: 280.2. Samples: 1948547. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:20,485][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:37:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7798784. Throughput: 0: 280.6. Samples: 1950233. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:25,484][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:37:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7798784. Throughput: 0: 278.1. Samples: 1950997. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:30,486][00034] Avg episode reward: [(0, '-1.924')] [2024-08-05 07:37:34,933][00139] DAMAGECOUNT value on done: 91728.0 [2024-08-05 07:37:34,934][00139] Sum rewards: 4.026, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.360', 'AMMO5': '0.005', 'AMMO2': '0.006', 'weapon5': '0.030', 'AMMO4': '0.032', 'WEAPON5': '0.050', 'ARMOR': '0.072', 'AMMO3': '0.133', 'HITCOUNT': '0.380', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.350', 'weapon2': '1.500', 'weapon3': '2.228', 'FRAGCOUNT': '6.000'} [2024-08-05 07:37:35,177][00139] DAMAGECOUNT value on done: 97732.0 [2024-08-05 07:37:35,178][00139] Sum rewards: 3.002, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.734', 'AMMO2': '0.011', 'AMMO5': '0.018', 'AMMO4': '0.054', 'ARMOR': '0.072', 'AMMO3': '0.107', 'weapon5': '0.134', 'WEAPON5': '0.250', 'HITCOUNT': '0.390', 'WEAPON3': '0.700', 'weapon2': '1.042', 'DAMAGECOUNT': '1.797', 'weapon3': '2.162', 'FRAGCOUNT': '3.000'} [2024-08-05 07:37:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7806976. Throughput: 0: 278.9. Samples: 1952700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:35,484][00034] Avg episode reward: [(0, '-1.795')] [2024-08-05 07:37:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000953_7806976.pth... [2024-08-05 07:37:35,565][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000920_7536640.pth [2024-08-05 07:37:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7815168. Throughput: 0: 279.4. Samples: 1954414. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:40,484][00034] Avg episode reward: [(0, '-1.795')] [2024-08-05 07:37:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7823360. Throughput: 0: 281.0. Samples: 1955289. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:45,484][00034] Avg episode reward: [(0, '-1.795')] [2024-08-05 07:37:49,669][00139] DAMAGECOUNT value on done: 92043.0 [2024-08-05 07:37:49,669][00139] Sum rewards: 1.756, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.915', 'AMMO2': '0.007', 'WEAPON1': '0.010', 'AMMO5': '0.018', 'AMMO4': '0.035', 'WEAPON4': '0.050', 'weapon5': '0.080', 'AMMO3': '0.156', 'HITCOUNT': '0.230', 'WEAPON5': '0.250', 'ARMOR': '0.483', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.945', 'weapon3': '1.788', 'weapon2': '1.818', 'FRAGCOUNT': '5.000'} [2024-08-05 07:37:49,892][00139] DAMAGECOUNT value on done: 98287.0 [2024-08-05 07:37:49,893][00139] Sum rewards: -5.111, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.411', 'AMMO5': '0.005', 'AMMO2': '0.005', 'AMMO4': '0.026', 'ARMOR': '0.028', 'WEAPON5': '0.050', 'AMMO3': '0.192', 'HITCOUNT': '0.390', 'WEAPON3': '1.050', 'DAMAGECOUNT': '1.665', 'weapon2': '1.668', 'weapon3': '1.970', 'FRAGCOUNT': '2.000'} [2024-08-05 07:37:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7823360. Throughput: 0: 280.8. Samples: 1956980. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:50,484][00034] Avg episode reward: [(0, '-1.760')] [2024-08-05 07:37:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7831552. Throughput: 0: 280.9. Samples: 1958673. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:37:55,485][00034] Avg episode reward: [(0, '-1.760')] [2024-08-05 07:38:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7839744. Throughput: 0: 280.7. Samples: 1959527. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:00,486][00034] Avg episode reward: [(0, '-1.760')] [2024-08-05 07:38:04,748][00139] DAMAGECOUNT value on done: 92246.0 [2024-08-05 07:38:04,748][00139] Sum rewards: 0.209, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.072', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'AMMO5': '0.010', 'weapon7': '0.056', 'HITCOUNT': '0.070', 'AMMO3': '0.085', 'weapon5': '0.086', 'WEAPON5': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.609', 'weapon2': '0.724', 'FRAGCOUNT': '1.000', 'weapon3': '1.136'} [2024-08-05 07:38:04,971][00139] DAMAGECOUNT value on done: 98465.0 [2024-08-05 07:38:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7839744. Throughput: 0: 280.3. Samples: 1961159. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:05,484][00034] Avg episode reward: [(0, '-1.781')] [2024-08-05 07:38:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7847936. Throughput: 0: 280.4. Samples: 1962850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:10,485][00034] Avg episode reward: [(0, '-1.781')] [2024-08-05 07:38:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7856128. Throughput: 0: 282.4. Samples: 1963707. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:15,485][00034] Avg episode reward: [(0, '-1.781')] [2024-08-05 07:38:19,571][00139] DAMAGECOUNT value on done: 92529.0 [2024-08-05 07:38:19,571][00139] Sum rewards: 0.018, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.298', 'AMMO2': '0.002', 'weapon4': '0.006', 'AMMO5': '0.007', 'AMMO4': '0.011', 'WEAPON4': '0.050', 'weapon5': '0.070', 'AMMO3': '0.120', 'WEAPON5': '0.150', 'HITCOUNT': '0.260', 'ARMOR': '0.524', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.849', 'weapon2': '1.466', 'weapon3': '2.250', 'FRAGCOUNT': '3.000'} [2024-08-05 07:38:19,789][00139] DAMAGECOUNT value on done: 98685.0 [2024-08-05 07:38:19,790][00139] Sum rewards: -3.165, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.432', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.015', 'weapon4': '0.038', 'ARMOR': '0.080', 'WEAPON4': '0.100', 'AMMO3': '0.123', 'weapon5': '0.168', 'HITCOUNT': '0.170', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.660', 'WEAPON3': '0.700', 'weapon3': '1.600', 'weapon2': '1.664'} [2024-08-05 07:38:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7856128. Throughput: 0: 282.4. Samples: 1965409. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:20,485][00034] Avg episode reward: [(0, '-1.755')] [2024-08-05 07:38:21,551][00138] Updated weights for policy 0, policy_version 960 (0.0017) [2024-08-05 07:38:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7864320. Throughput: 0: 282.4. Samples: 1967121. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:25,485][00034] Avg episode reward: [(0, '-1.755')] [2024-08-05 07:38:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7872512. Throughput: 0: 282.4. Samples: 1967998. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:30,486][00034] Avg episode reward: [(0, '-1.755')] [2024-08-05 07:38:34,391][00139] DAMAGECOUNT value on done: 92960.0 [2024-08-05 07:38:34,392][00139] Sum rewards: 1.810, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.006', 'AMMO5': '0.007', 'AMMO2': '0.017', 'AMMO4': '0.086', 'WEAPON5': '0.100', 'ARMOR': '0.108', 'weapon5': '0.116', 'AMMO3': '0.120', 'HITCOUNT': '0.200', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.293', 'weapon3': '1.734', 'weapon2': '1.784', 'FRAGCOUNT': '4.000'} [2024-08-05 07:38:34,622][00139] DAMAGECOUNT value on done: 99087.0 [2024-08-05 07:38:34,622][00139] Sum rewards: 1.342, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.129', 'AMMO2': '0.006', 'AMMO5': '0.014', 'weapon5': '0.022', 'AMMO4': '0.028', 'ARMOR': '0.096', 'WEAPON4': '0.100', 'AMMO3': '0.103', 'weapon4': '0.144', 'WEAPON5': '0.300', 'HITCOUNT': '0.310', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.206', 'weapon2': '1.476', 'weapon3': '1.566', 'FRAGCOUNT': '2.500'} [2024-08-05 07:38:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7872512. Throughput: 0: 281.8. Samples: 1969662. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:35,484][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:38:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7880704. Throughput: 0: 282.3. Samples: 1971375. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:40,484][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:38:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7888896. Throughput: 0: 282.2. Samples: 1972225. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:45,484][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:38:49,322][00139] DAMAGECOUNT value on done: 93409.0 [2024-08-05 07:38:49,323][00139] Sum rewards: 0.375, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.200', 'AMMO5': '0.007', 'AMMO2': '0.020', 'weapon5': '0.040', 'ARMOR': '0.072', 'WEAPON5': '0.100', 'AMMO4': '0.102', 'AMMO3': '0.112', 'WEAPON4': '0.150', 'weapon4': '0.234', 'HITCOUNT': '0.360', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.347', 'weapon2': '1.522', 'weapon3': '1.808', 'FRAGCOUNT': '3.000'} [2024-08-05 07:38:49,542][00139] DAMAGECOUNT value on done: 99327.0 [2024-08-05 07:38:49,542][00139] Sum rewards: 0.457, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.580', 'AMMO5': '0.003', 'AMMO2': '0.013', 'ARMOR': '0.036', 'weapon5': '0.048', 'AMMO4': '0.065', 'WEAPON5': '0.100', 'weapon7': '0.100', 'AMMO3': '0.120', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON4': '0.150', 'HITCOUNT': '0.180', 'WEAPON7': '0.200', 'weapon4': '0.214', 'DAMAGECOUNT': '0.720', 'WEAPON3': '0.750', 'weapon2': '1.270', 'weapon3': '1.328', 'FRAGCOUNT': '3.000'} [2024-08-05 07:38:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7888896. Throughput: 0: 282.6. Samples: 1973878. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:50,484][00034] Avg episode reward: [(0, '-1.771')] [2024-08-05 07:38:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7897088. Throughput: 0: 282.9. Samples: 1975582. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:38:55,484][00034] Avg episode reward: [(0, '-1.771')] [2024-08-05 07:39:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7905280. Throughput: 0: 283.4. Samples: 1976459. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:00,484][00034] Avg episode reward: [(0, '-1.771')] [2024-08-05 07:39:04,195][00139] DAMAGECOUNT value on done: 93687.0 [2024-08-05 07:39:04,195][00139] Sum rewards: 1.675, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO5': '0.010', 'AMMO2': '0.017', 'weapon4': '0.034', 'WEAPON4': '0.050', 'AMMO4': '0.086', 'AMMO3': '0.120', 'weapon5': '0.154', 'HITCOUNT': '0.180', 'WEAPON5': '0.200', 'HEALTH': '0.268', 'ARMOR': '0.440', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.834', 'weapon2': '1.682', 'weapon3': '1.800', 'FRAGCOUNT': '2.000'} [2024-08-05 07:39:04,423][00139] DAMAGECOUNT value on done: 100002.0 [2024-08-05 07:39:04,423][00139] Sum rewards: -4.126, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-2.620', 'AMMO2': '0.010', 'AMMO5': '0.025', 'WEAPON4': '0.050', 'AMMO4': '0.051', 'weapon5': '0.082', 'ARMOR': '0.112', 'weapon4': '0.202', 'AMMO3': '0.221', 'HITCOUNT': '0.300', 'WEAPON5': '0.300', 'WEAPON3': '1.200', 'weapon2': '1.422', 'weapon3': '1.744', 'DAMAGECOUNT': '2.025', 'FRAGCOUNT': '3.500'} [2024-08-05 07:39:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7913472. Throughput: 0: 282.4. Samples: 1978119. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:05,484][00034] Avg episode reward: [(0, '-1.738')] [2024-08-05 07:39:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7913472. Throughput: 0: 282.5. Samples: 1979832. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:10,485][00034] Avg episode reward: [(0, '-1.738')] [2024-08-05 07:39:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7921664. Throughput: 0: 281.9. Samples: 1980683. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:15,485][00034] Avg episode reward: [(0, '-1.738')] [2024-08-05 07:39:18,950][00139] DAMAGECOUNT value on done: 94093.0 [2024-08-05 07:39:18,951][00139] Sum rewards: -5.004, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.236', 'AMMO5': '0.007', 'AMMO2': '0.008', 'ARMOR': '0.032', 'AMMO4': '0.041', 'weapon5': '0.090', 'AMMO3': '0.129', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.184', 'HITCOUNT': '0.270', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.218', 'weapon3': '1.520', 'weapon2': '1.832'} [2024-08-05 07:39:19,198][00139] DAMAGECOUNT value on done: 100145.0 [2024-08-05 07:39:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7929856. Throughput: 0: 282.7. Samples: 1982382. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:20,485][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:39:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7929856. Throughput: 0: 282.8. Samples: 1984102. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:25,484][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:39:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7938048. Throughput: 0: 282.6. Samples: 1984942. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:30,484][00034] Avg episode reward: [(0, '-1.831')] [2024-08-05 07:39:33,734][00139] DAMAGECOUNT value on done: 94580.0 [2024-08-05 07:39:33,735][00139] Sum rewards: -1.023, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.794', 'AMMO2': '0.010', 'AMMO5': '0.010', 'AMMO4': '0.048', 'weapon5': '0.062', 'ARMOR': '0.076', 'AMMO3': '0.138', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.192', 'HITCOUNT': '0.370', 'WEAPON3': '0.850', 'weapon2': '1.214', 'DAMAGECOUNT': '1.461', 'weapon3': '1.790', 'FRAGCOUNT': '3.000'} [2024-08-05 07:39:33,941][00138] Updated weights for policy 0, policy_version 970 (0.0017) [2024-08-05 07:39:34,054][00139] DAMAGECOUNT value on done: 100339.0 [2024-08-05 07:39:34,055][00139] Sum rewards: -3.577, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.501', 'AMMO2': '0.005', 'AMMO5': '0.010', 'AMMO4': '0.027', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'weapon5': '0.076', 'weapon4': '0.082', 'HITCOUNT': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.118', 'WEAPON5': '0.250', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.582', 'WEAPON3': '0.800', 'weapon3': '1.358', 'weapon2': '1.876'} [2024-08-05 07:39:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7946240. Throughput: 0: 283.0. Samples: 1986615. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:35,485][00034] Avg episode reward: [(0, '-1.854')] [2024-08-05 07:39:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000970_7946240.pth... [2024-08-05 07:39:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000937_7675904.pth [2024-08-05 07:39:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7946240. Throughput: 0: 282.8. Samples: 1988306. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:40,485][00034] Avg episode reward: [(0, '-1.854')] [2024-08-05 07:39:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7954432. Throughput: 0: 282.5. Samples: 1989171. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:45,484][00034] Avg episode reward: [(0, '-1.854')] [2024-08-05 07:39:48,650][00139] DAMAGECOUNT value on done: 94994.0 [2024-08-05 07:39:48,651][00139] Sum rewards: -4.877, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.182', 'ARMOR': '0.004', 'AMMO2': '0.008', 'AMMO5': '0.015', 'AMMO4': '0.041', 'WEAPON4': '0.100', 'weapon4': '0.136', 'weapon5': '0.154', 'AMMO3': '0.184', 'HITCOUNT': '0.190', 'WEAPON5': '0.250', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon2': '1.126', 'DAMAGECOUNT': '1.242', 'weapon3': '2.054'} [2024-08-05 07:39:48,925][00139] DAMAGECOUNT value on done: 100945.0 [2024-08-05 07:39:48,925][00139] Sum rewards: 0.848, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.920', 'AMMO5': '0.007', 'weapon5': '0.012', 'AMMO2': '0.020', 'ARMOR': '0.036', 'AMMO4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.170', 'WEAPON4': '0.250', 'weapon4': '0.406', 'HITCOUNT': '0.470', 'weapon2': '0.910', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.818', 'weapon3': '2.218', 'FRAGCOUNT': '4.000'} [2024-08-05 07:39:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7962624. Throughput: 0: 282.9. Samples: 1990851. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:50,485][00034] Avg episode reward: [(0, '-1.888')] [2024-08-05 07:39:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7962624. Throughput: 0: 283.3. Samples: 1992582. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:39:55,484][00034] Avg episode reward: [(0, '-1.888')] [2024-08-05 07:40:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). 
Total num frames: 7970816. Throughput: 0: 283.8. Samples: 1993453. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:00,484][00034] Avg episode reward: [(0, '-1.888')] [2024-08-05 07:40:03,401][00139] DAMAGECOUNT value on done: 95251.0 [2024-08-05 07:40:03,402][00139] Sum rewards: -1.577, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.506', 'AMMO5': '0.007', 'AMMO2': '0.024', 'ARMOR': '0.044', 'weapon5': '0.070', 'AMMO4': '0.117', 'weapon4': '0.150', 'HITCOUNT': '0.160', 'AMMO3': '0.163', 'WEAPON4': '0.200', 'WEAPON5': '0.200', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.771', 'weapon3': '1.464', 'weapon2': '1.558', 'FRAGCOUNT': '3.000'} [2024-08-05 07:40:03,640][00139] DAMAGECOUNT value on done: 101405.0 [2024-08-05 07:40:03,641][00139] Sum rewards: -5.796, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.898', 'AMMO2': '0.003', 'AMMO5': '0.007', 'AMMO4': '0.016', 'weapon5': '0.018', 'WEAPON4': '0.050', 'weapon4': '0.052', 'WEAPON5': '0.150', 'AMMO3': '0.271', 'HITCOUNT': '0.380', 'DAMAGECOUNT': '1.380', 'WEAPON3': '1.450', 'weapon2': '1.460', 'weapon3': '2.114', 'FRAGCOUNT': '4.000'} [2024-08-05 07:40:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7979008. Throughput: 0: 283.3. Samples: 1995130. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:05,484][00034] Avg episode reward: [(0, '-1.857')] [2024-08-05 07:40:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 7987200. Throughput: 0: 281.9. Samples: 1996787. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:10,485][00034] Avg episode reward: [(0, '-1.857')] [2024-08-05 07:40:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 7987200. Throughput: 0: 282.4. Samples: 1997649. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:15,484][00034] Avg episode reward: [(0, '-1.857')] [2024-08-05 07:40:18,385][00139] DAMAGECOUNT value on done: 95413.0 [2024-08-05 07:40:18,386][00139] Sum rewards: 1.119, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.470', 'AMMO2': '0.003', 'ARMOR': '0.008', 'AMMO4': '0.015', 'AMMO5': '0.018', 'weapon7': '0.064', 'weapon5': '0.066', 'AMMO3': '0.109', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'weapon4': '0.218', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'DAMAGECOUNT': '0.486', 'WEAPON3': '0.650', 'weapon2': '0.768', 'weapon3': '1.664', 'FRAGCOUNT': '2.000'} [2024-08-05 07:40:18,619][00139] DAMAGECOUNT value on done: 101460.0 [2024-08-05 07:40:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 7995392. Throughput: 0: 282.6. Samples: 1999330. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:20,485][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:40:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8003584. Throughput: 0: 283.4. Samples: 2001057. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:25,484][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:40:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8003584. Throughput: 0: 283.5. Samples: 2001930. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:30,485][00034] Avg episode reward: [(0, '-1.777')] [2024-08-05 07:40:33,063][00139] DAMAGECOUNT value on done: 95623.0 [2024-08-05 07:40:33,064][00139] Sum rewards: -3.707, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.133', 'AMMO5': '0.005', 'AMMO2': '0.009', 'weapon7': '0.024', 'AMMO4': '0.044', 'WEAPON5': '0.050', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'ARMOR': '0.112', 'HITCOUNT': '0.130', 'WEAPON4': '0.150', 'AMMO3': '0.196', 'weapon4': '0.250', 'DAMAGECOUNT': '0.630', 'weapon2': '1.050', 'WEAPON3': '1.150', 'FRAGCOUNT': '2.000', 'weapon3': '2.076'} [2024-08-05 07:40:33,273][00139] DAMAGECOUNT value on done: 101530.0 [2024-08-05 07:40:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8011776. Throughput: 0: 283.9. Samples: 2003628. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:35,484][00034] Avg episode reward: [(0, '-1.824')] [2024-08-05 07:40:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8019968. Throughput: 0: 282.2. Samples: 2005283. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:40,484][00034] Avg episode reward: [(0, '-1.824')] [2024-08-05 07:40:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8019968. Throughput: 0: 282.4. Samples: 2006161. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:45,484][00034] Avg episode reward: [(0, '-1.824')] [2024-08-05 07:40:46,249][00138] Updated weights for policy 0, policy_version 980 (0.0017) [2024-08-05 07:40:47,939][00139] DAMAGECOUNT value on done: 95778.0 [2024-08-05 07:40:47,940][00139] Sum rewards: -8.000, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-1.716', 'AMMO2': '0.004', 'AMMO5': '0.007', 'ARMOR': '0.020', 'AMMO4': '0.021', 'weapon5': '0.026', 'weapon4': '0.052', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'HITCOUNT': '0.190', 'AMMO3': '0.226', 'DAMAGECOUNT': '0.465', 'WEAPON3': '1.200', 'weapon2': '1.520', 'weapon3': '1.934', 'FRAGCOUNT': '2.000'} [2024-08-05 07:40:48,151][00139] DAMAGECOUNT value on done: 101827.0 [2024-08-05 07:40:48,151][00139] Sum rewards: -5.148, reward structure: {'DEATHCOUNT': '-10.500', 'FRAGCOUNT': '-1.000', 'AMMO2': '0.018', 'AMMO5': '0.022', 'weapon4': '0.032', 'ARMOR': '0.036', 'WEAPON4': '0.050', 'AMMO4': '0.090', 'AMMO3': '0.140', 'HITCOUNT': '0.190', 'weapon5': '0.192', 'WEAPON5': '0.350', 'HEALTH': '0.424', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.891', 'weapon2': '1.388', 'weapon3': '1.828'} [2024-08-05 07:40:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8028160. Throughput: 0: 283.1. Samples: 2007871. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:50,484][00034] Avg episode reward: [(0, '-1.944')] [2024-08-05 07:40:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8036352. Throughput: 0: 284.5. Samples: 2009588. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:40:55,486][00034] Avg episode reward: [(0, '-1.944')] [2024-08-05 07:41:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8036352. Throughput: 0: 284.2. Samples: 2010439. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:00,485][00034] Avg episode reward: [(0, '-1.944')] [2024-08-05 07:41:02,647][00139] DAMAGECOUNT value on done: 96048.0 [2024-08-05 07:41:02,647][00139] Sum rewards: -7.823, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.897', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.003', 'AMMO5': '0.013', 'AMMO4': '0.013', 'ARMOR': '0.028', 'weapon5': '0.054', 'weapon4': '0.088', 'WEAPON4': '0.100', 'AMMO3': '0.191', 'WEAPON5': '0.200', 'HITCOUNT': '0.250', 'DAMAGECOUNT': '0.810', 'WEAPON3': '1.100', 'weapon2': '1.338', 'weapon3': '1.636'} [2024-08-05 07:41:02,880][00139] DAMAGECOUNT value on done: 102122.0 [2024-08-05 07:41:02,881][00139] Sum rewards: -6.209, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.542', 'AMMO5': '0.017', 'ARMOR': '0.026', 'AMMO2': '0.035', 'weapon5': '0.084', 'weapon4': '0.122', 'AMMO4': '0.173', 'AMMO3': '0.182', 'HITCOUNT': '0.280', 'WEAPON5': '0.350', 'WEAPON4': '0.400', 'WEAPON3': '0.850', 'DAMAGECOUNT': '0.885', 'weapon3': '1.462', 'weapon2': '1.966', 'FRAGCOUNT': '2.000'} [2024-08-05 07:41:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8044544. Throughput: 0: 285.1. Samples: 2012158. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:05,484][00034] Avg episode reward: [(0, '-2.069')] [2024-08-05 07:41:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8052736. Throughput: 0: 284.4. Samples: 2013856. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:10,484][00034] Avg episode reward: [(0, '-2.069')] [2024-08-05 07:41:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8060928. Throughput: 0: 284.3. Samples: 2014725. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:15,484][00034] Avg episode reward: [(0, '-2.069')] [2024-08-05 07:41:17,306][00139] DAMAGECOUNT value on done: 96655.0 [2024-08-05 07:41:17,306][00139] Sum rewards: 4.438, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.891', 'AMMO2': '0.008', 'AMMO5': '0.015', 'AMMO4': '0.039', 'AMMO3': '0.077', 'WEAPON4': '0.100', 'weapon4': '0.168', 'weapon5': '0.178', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'WEAPON3': '0.450', 'ARMOR': '0.883', 'weapon3': '1.084', 'weapon2': '1.596', 'DAMAGECOUNT': '1.821', 'FRAGCOUNT': '3.000'} [2024-08-05 07:41:17,534][00139] DAMAGECOUNT value on done: 102379.0 [2024-08-05 07:41:17,534][00139] Sum rewards: -6.958, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.705', 'weapon5': '0.006', 'AMMO5': '0.013', 'AMMO2': '0.020', 'ARMOR': '0.048', 'weapon7': '0.064', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO4': '0.101', 'AMMO3': '0.140', 'WEAPON4': '0.150', 'WEAPON5': '0.200', 'HITCOUNT': '0.240', 'weapon4': '0.284', 'DAMAGECOUNT': '0.771', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.472', 'weapon3': '1.788'} [2024-08-05 07:41:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8060928. Throughput: 0: 285.2. Samples: 2016464. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:20,484][00034] Avg episode reward: [(0, '-2.019')] [2024-08-05 07:41:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8069120. Throughput: 0: 286.3. Samples: 2018167. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:25,484][00034] Avg episode reward: [(0, '-2.019')] [2024-08-05 07:41:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8077312. Throughput: 0: 286.0. Samples: 2019033. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:30,487][00034] Avg episode reward: [(0, '-2.019')] [2024-08-05 07:41:32,045][00139] DAMAGECOUNT value on done: 96987.0 [2024-08-05 07:41:32,045][00139] Sum rewards: 0.739, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.571', 'AMMO5': '0.004', 'AMMO2': '0.027', 'weapon5': '0.042', 'ARMOR': '0.048', 'WEAPON5': '0.100', 'AMMO4': '0.134', 'AMMO3': '0.135', 'HITCOUNT': '0.260', 'WEAPON4': '0.300', 'weapon4': '0.338', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.996', 'weapon2': '1.398', 'weapon3': '1.628', 'FRAGCOUNT': '2.000'} [2024-08-05 07:41:32,296][00139] DAMAGECOUNT value on done: 102649.0 [2024-08-05 07:41:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8077312. Throughput: 0: 285.9. Samples: 2020738. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:35,484][00034] Avg episode reward: [(0, '-2.042')] [2024-08-05 07:41:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000986_8077312.pth... [2024-08-05 07:41:35,569][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000953_7806976.pth [2024-08-05 07:41:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8085504. Throughput: 0: 284.8. Samples: 2022402. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:40,484][00034] Avg episode reward: [(0, '-2.042')] [2024-08-05 07:41:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8093696. Throughput: 0: 284.6. Samples: 2023246. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:45,485][00034] Avg episode reward: [(0, '-2.042')] [2024-08-05 07:41:46,895][00139] DAMAGECOUNT value on done: 97170.0 [2024-08-05 07:41:46,895][00139] Sum rewards: -7.657, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.436', 'weapon5': '0.002', 'AMMO5': '0.010', 'AMMO2': '0.017', 'ARMOR': '0.064', 'AMMO4': '0.085', 'WEAPON5': '0.100', 'weapon4': '0.168', 'AMMO3': '0.180', 'HITCOUNT': '0.180', 'WEAPON4': '0.250', 'DAMAGECOUNT': '0.549', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.610', 'weapon3': '1.664'} [2024-08-05 07:41:47,119][00139] DAMAGECOUNT value on done: 102864.0 [2024-08-05 07:41:47,120][00139] Sum rewards: -5.692, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.864', 'AMMO5': '0.006', 'weapon4': '0.016', 'AMMO2': '0.030', 'weapon5': '0.036', 'ARMOR': '0.064', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO4': '0.151', 'AMMO3': '0.171', 'HITCOUNT': '0.230', 'DAMAGECOUNT': '0.645', 'WEAPON3': '1.100', 'weapon2': '1.244', 'FRAGCOUNT': '2.000', 'weapon3': '2.228'} [2024-08-05 07:41:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8093696. Throughput: 0: 284.2. Samples: 2024949. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:50,486][00034] Avg episode reward: [(0, '-2.120')] [2024-08-05 07:41:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8101888. Throughput: 0: 284.4. Samples: 2026652. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:41:55,485][00034] Avg episode reward: [(0, '-2.120')] [2024-08-05 07:41:58,150][00138] Updated weights for policy 0, policy_version 990 (0.0016) [2024-08-05 07:42:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8110080. Throughput: 0: 284.5. Samples: 2027529. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:00,485][00034] Avg episode reward: [(0, '-2.120')] [2024-08-05 07:42:01,742][00139] DAMAGECOUNT value on done: 97395.0 [2024-08-05 07:42:01,993][00139] DAMAGECOUNT value on done: 103119.0 [2024-08-05 07:42:01,994][00139] Sum rewards: -1.101, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.404', 'AMMO2': '0.013', 'AMMO5': '0.020', 'AMMO4': '0.064', 'ARMOR': '0.072', 'AMMO3': '0.109', 'weapon5': '0.130', 'WEAPON4': '0.150', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'HITCOUNT': '0.230', 'weapon4': '0.240', 'WEAPON5': '0.300', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.765', 'weapon2': '1.562', 'weapon3': '1.598', 'FRAGCOUNT': '2.000'} [2024-08-05 07:42:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8118272. Throughput: 0: 283.6. Samples: 2029228. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:05,484][00034] Avg episode reward: [(0, '-2.136')] [2024-08-05 07:42:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8118272. Throughput: 0: 282.8. Samples: 2030893. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:10,486][00034] Avg episode reward: [(0, '-2.136')] [2024-08-05 07:42:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8126464. Throughput: 0: 281.3. Samples: 2031690. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:15,484][00034] Avg episode reward: [(0, '-2.136')] [2024-08-05 07:42:16,716][00139] DAMAGECOUNT value on done: 97496.0 [2024-08-05 07:42:16,717][00139] Sum rewards: -5.245, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.360', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.010', 'AMMO2': '0.011', 'AMMO4': '0.053', 'ARMOR': '0.072', 'weapon4': '0.082', 'HITCOUNT': '0.110', 'AMMO3': '0.114', 'weapon5': '0.116', 'WEAPON4': '0.150', 'WEAPON5': '0.150', 'DAMAGECOUNT': '0.303', 'WEAPON3': '0.700', 'weapon3': '1.356', 'weapon2': '1.638'} [2024-08-05 07:42:16,954][00139] DAMAGECOUNT value on done: 103386.0 [2024-08-05 07:42:16,954][00139] Sum rewards: -3.464, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.495', 'AMMO4': '-0.044', 'AMMO2': '-0.009', 'AMMO5': '0.003', 'WEAPON1': '0.020', 'weapon5': '0.078', 'WEAPON5': '0.100', 'AMMO3': '0.158', 'HITCOUNT': '0.240', 'DAMAGECOUNT': '0.801', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon3': '1.370', 'weapon2': '1.614'} [2024-08-05 07:42:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8134656. Throughput: 0: 281.2. Samples: 2033394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:20,485][00034] Avg episode reward: [(0, '-2.202')] [2024-08-05 07:42:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8134656. Throughput: 0: 281.9. Samples: 2035087. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:25,485][00034] Avg episode reward: [(0, '-2.202')] [2024-08-05 07:42:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8142848. Throughput: 0: 282.8. Samples: 2035971. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:30,485][00034] Avg episode reward: [(0, '-2.202')] [2024-08-05 07:42:31,510][00139] DAMAGECOUNT value on done: 97641.0 [2024-08-05 07:42:31,511][00139] Sum rewards: -5.698, reward structure: {'DEATHCOUNT': '-9.000', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.984', 'AMMO2': '0.017', 'AMMO5': '0.025', 'AMMO4': '0.087', 'ARMOR': '0.088', 'weapon5': '0.120', 'HITCOUNT': '0.130', 'AMMO3': '0.146', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.435', 'WEAPON3': '0.900', 'weapon2': '1.460', 'weapon3': '1.978'} [2024-08-05 07:42:31,750][00139] DAMAGECOUNT value on done: 103931.0 [2024-08-05 07:42:31,750][00139] Sum rewards: -0.488, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.189', 'AMMO2': '0.004', 'AMMO4': '0.019', 'AMMO5': '0.020', 'weapon5': '0.106', 'AMMO3': '0.133', 'WEAPON5': '0.300', 'HITCOUNT': '0.310', 'WEAPON3': '0.700', 'weapon2': '1.520', 'DAMAGECOUNT': '1.635', 'weapon3': '1.954', 'FRAGCOUNT': '4.000'} [2024-08-05 07:42:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8151040. Throughput: 0: 282.9. Samples: 2037678. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:35,485][00034] Avg episode reward: [(0, '-2.219')] [2024-08-05 07:42:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8151040. Throughput: 0: 281.6. Samples: 2039326. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:40,484][00034] Avg episode reward: [(0, '-2.219')] [2024-08-05 07:42:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8159232. Throughput: 0: 280.3. Samples: 2040143. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:45,484][00034] Avg episode reward: [(0, '-2.219')] [2024-08-05 07:42:46,719][00139] DAMAGECOUNT value on done: 97866.0 [2024-08-05 07:42:46,719][00139] Sum rewards: -6.985, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.499', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.020', 'WEAPON1': '0.020', 'AMMO2': '0.050', 'ARMOR': '0.064', 'weapon5': '0.120', 'AMMO3': '0.138', 'WEAPON4': '0.150', 'AMMO4': '0.249', 'HITCOUNT': '0.260', 'weapon4': '0.264', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.675', 'WEAPON3': '0.850', 'weapon2': '1.372', 'weapon3': '1.582'} [2024-08-05 07:42:46,953][00139] DAMAGECOUNT value on done: 104191.0 [2024-08-05 07:42:46,954][00139] Sum rewards: -2.570, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.974', 'AMMO5': '0.012', 'AMMO2': '0.027', 'ARMOR': '0.032', 'weapon5': '0.120', 'AMMO4': '0.133', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'AMMO3': '0.168', 'HITCOUNT': '0.240', 'weapon4': '0.302', 'DAMAGECOUNT': '0.780', 'WEAPON3': '0.950', 'weapon2': '1.428', 'weapon3': '1.662', 'FRAGCOUNT': '2.000'} [2024-08-05 07:42:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8167424. Throughput: 0: 279.6. Samples: 2041812. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:50,485][00034] Avg episode reward: [(0, '-2.276')] [2024-08-05 07:42:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8167424. Throughput: 0: 280.8. Samples: 2043530. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:42:55,485][00034] Avg episode reward: [(0, '-2.276')] [2024-08-05 07:43:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8175616. Throughput: 0: 281.6. Samples: 2044364. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:00,485][00034] Avg episode reward: [(0, '-2.276')] [2024-08-05 07:43:01,580][00139] DAMAGECOUNT value on done: 98382.0 [2024-08-05 07:43:01,580][00139] Sum rewards: 0.330, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.510', 'AMMO2': '0.006', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'AMMO4': '0.032', 'AMMO3': '0.136', 'weapon5': '0.152', 'WEAPON5': '0.300', 'HITCOUNT': '0.360', 'ARMOR': '0.527', 'WEAPON3': '0.900', 'weapon2': '1.384', 'DAMAGECOUNT': '1.548', 'weapon3': '1.962', 'FRAGCOUNT': '5.000'} [2024-08-05 07:43:01,795][00139] DAMAGECOUNT value on done: 104301.0 [2024-08-05 07:43:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8183808. Throughput: 0: 281.4. Samples: 2046056. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:05,484][00034] Avg episode reward: [(0, '-2.282')] [2024-08-05 07:43:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8183808. Throughput: 0: 282.7. Samples: 2047809. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:10,485][00034] Avg episode reward: [(0, '-2.282')] [2024-08-05 07:43:10,881][00138] Updated weights for policy 0, policy_version 1000 (0.0018) [2024-08-05 07:43:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8192000. Throughput: 0: 282.1. Samples: 2048667. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:15,487][00034] Avg episode reward: [(0, '-2.282')] [2024-08-05 07:43:16,321][00139] DAMAGECOUNT value on done: 99211.0 [2024-08-05 07:43:16,322][00139] Sum rewards: 3.845, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.270', 'AMMO2': '0.004', 'weapon7': '0.012', 'AMMO5': '0.015', 'AMMO4': '0.020', 'WEAPON1': '0.020', 'WEAPON4': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'weapon5': '0.122', 'AMMO3': '0.161', 'WEAPON5': '0.300', 'HITCOUNT': '0.420', 'WEAPON3': '0.950', 'weapon2': '1.234', 'weapon3': '1.970', 'DAMAGECOUNT': '2.487', 'FRAGCOUNT': '6.000'} [2024-08-05 07:43:16,558][00139] DAMAGECOUNT value on done: 104670.0 [2024-08-05 07:43:16,558][00139] Sum rewards: -0.404, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.756', 'AMMO5': '0.008', 'AMMO2': '0.017', 'weapon5': '0.064', 'AMMO4': '0.087', 'weapon7': '0.098', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.125', 'WEAPON4': '0.150', 'HITCOUNT': '0.180', 'WEAPON5': '0.200', 'weapon4': '0.312', 'WEAPON3': '0.700', 'weapon2': '1.000', 'DAMAGECOUNT': '1.107', 'weapon3': '1.254', 'FRAGCOUNT': '3.000'} [2024-08-05 07:43:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8200192. Throughput: 0: 281.5. Samples: 2050344. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:20,485][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 07:43:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8208384. Throughput: 0: 283.3. Samples: 2052073. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:25,484][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 07:43:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8208384. Throughput: 0: 283.8. Samples: 2052912. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:30,485][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 07:43:31,041][00139] DAMAGECOUNT value on done: 99390.0 [2024-08-05 07:43:31,042][00139] Sum rewards: -3.868, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.349', 'AMMO5': '0.013', 'AMMO2': '0.017', 'weapon4': '0.020', 'ARMOR': '0.028', 'weapon5': '0.054', 'AMMO4': '0.083', 'WEAPON4': '0.100', 'HITCOUNT': '0.150', 'AMMO3': '0.152', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.537', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.610', 'weapon2': '1.618'} [2024-08-05 07:43:31,254][00139] DAMAGECOUNT value on done: 104894.0 [2024-08-05 07:43:31,255][00139] Sum rewards: -4.037, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.900', 'AMMO2': '0.001', 'AMMO4': '0.005', 'AMMO5': '0.013', 'ARMOR': '0.032', 'WEAPON5': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.168', 'HITCOUNT': '0.200', 'weapon4': '0.260', 'DAMAGECOUNT': '0.672', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.672', 'weapon3': '1.740'} [2024-08-05 07:43:35,484][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8216576. Throughput: 0: 284.2. Samples: 2054599. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:35,485][00034] Avg episode reward: [(0, '-2.298')] [2024-08-05 07:43:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_8216576.pth... [2024-08-05 07:43:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000970_7946240.pth [2024-08-05 07:43:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8224768. Throughput: 0: 283.6. Samples: 2056294. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:40,486][00034] Avg episode reward: [(0, '-2.298')] [2024-08-05 07:43:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8224768. 
Throughput: 0: 284.8. Samples: 2057178. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:45,484][00034] Avg episode reward: [(0, '-2.298')] [2024-08-05 07:43:45,645][00139] DAMAGECOUNT value on done: 99678.0 [2024-08-05 07:43:45,645][00139] Sum rewards: -0.327, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.546', 'AMMO5': '0.003', 'AMMO2': '0.019', 'AMMO3': '0.071', 'weapon5': '0.072', 'AMMO4': '0.093', 'WEAPON5': '0.100', 'HITCOUNT': '0.200', 'WEAPON3': '0.400', 'ARMOR': '0.400', 'DAMAGECOUNT': '0.864', 'FRAGCOUNT': '1.000', 'weapon3': '1.420', 'weapon2': '1.578'} [2024-08-05 07:43:45,864][00139] DAMAGECOUNT value on done: 105224.0 [2024-08-05 07:43:45,864][00139] Sum rewards: -4.054, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.390', 'weapon4': '0.004', 'WEAPON1': '0.010', 'AMMO2': '0.012', 'ARMOR': '0.040', 'AMMO4': '0.060', 'weapon7': '0.078', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.101', 'WEAPON4': '0.150', 'HITCOUNT': '0.210', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.990', 'FRAGCOUNT': '1.000', 'weapon3': '1.576', 'weapon2': '1.904'} [2024-08-05 07:43:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8232960. Throughput: 0: 284.3. Samples: 2058851. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:50,485][00034] Avg episode reward: [(0, '-2.360')] [2024-08-05 07:43:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8241152. Throughput: 0: 283.7. Samples: 2060574. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:43:55,484][00034] Avg episode reward: [(0, '-2.360')] [2024-08-05 07:44:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8241152. Throughput: 0: 283.3. Samples: 2061417. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:00,484][00034] Avg episode reward: [(0, '-2.360')] [2024-08-05 07:44:00,681][00139] DAMAGECOUNT value on done: 99865.0 [2024-08-05 07:44:00,682][00139] Sum rewards: -4.899, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.980', 'AMMO2': '0.001', 'AMMO4': '0.004', 'weapon7': '0.006', 'ARMOR': '0.028', 'WEAPON4': '0.100', 'AMMO3': '0.113', 'weapon4': '0.130', 'HITCOUNT': '0.170', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'DAMAGECOUNT': '0.561', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon3': '1.446', 'weapon2': '1.922'} [2024-08-05 07:44:00,914][00139] DAMAGECOUNT value on done: 105623.0 [2024-08-05 07:44:00,915][00139] Sum rewards: -2.609, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.928', 'AMMO2': '0.013', 'ARMOR': '0.024', 'AMMO5': '0.025', 'AMMO4': '0.064', 'weapon5': '0.090', 'weapon4': '0.094', 'WEAPON4': '0.150', 'AMMO3': '0.161', 'HITCOUNT': '0.200', 'WEAPON5': '0.300', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.197', 'weapon2': '1.328', 'weapon3': '2.024', 'FRAGCOUNT': '3.000'} [2024-08-05 07:44:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8249344. Throughput: 0: 283.3. Samples: 2063091. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:05,486][00034] Avg episode reward: [(0, '-2.462')] [2024-08-05 07:44:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.6). Total num frames: 8257536. Throughput: 0: 282.8. Samples: 2064801. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:10,485][00034] Avg episode reward: [(0, '-2.462')] [2024-08-05 07:44:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8257536. Throughput: 0: 283.3. Samples: 2065659. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:15,485][00034] Avg episode reward: [(0, '-2.462')] [2024-08-05 07:44:15,502][00139] DAMAGECOUNT value on done: 100167.0 [2024-08-05 07:44:15,502][00139] Sum rewards: -3.362, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.080', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.012', 'AMMO5': '0.013', 'AMMO4': '0.059', 'weapon5': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.159', 'WEAPON5': '0.250', 'HITCOUNT': '0.280', 'weapon4': '0.408', 'ARMOR': '0.491', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.906', 'weapon3': '1.368', 'weapon2': '1.522'} [2024-08-05 07:44:15,751][00139] DAMAGECOUNT value on done: 106240.0 [2024-08-05 07:44:15,751][00139] Sum rewards: -0.214, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.163', 'AMMO4': '-0.002', 'AMMO2': '-0.000', 'AMMO5': '0.007', 'WEAPON1': '0.020', 'weapon5': '0.048', 'WEAPON5': '0.100', 'AMMO3': '0.141', 'HITCOUNT': '0.440', 'WEAPON3': '0.800', 'weapon2': '1.326', 'DAMAGECOUNT': '1.851', 'weapon3': '1.968', 'FRAGCOUNT': '2.500'} [2024-08-05 07:44:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8265728. Throughput: 0: 281.9. Samples: 2067283. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:20,485][00034] Avg episode reward: [(0, '-2.461')] [2024-08-05 07:44:23,309][00138] Updated weights for policy 0, policy_version 1010 (0.0017) [2024-08-05 07:44:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8273920. Throughput: 0: 282.6. Samples: 2069011. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:25,484][00034] Avg episode reward: [(0, '-2.461')] [2024-08-05 07:44:30,405][00139] DAMAGECOUNT value on done: 100540.0 [2024-08-05 07:44:30,406][00139] Sum rewards: -2.260, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.620', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.010', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'AMMO4': '0.052', 'weapon4': '0.078', 'AMMO3': '0.155', 'WEAPON5': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.119', 'weapon2': '1.450', 'weapon3': '1.850', 'FRAGCOUNT': '2.000'} [2024-08-05 07:44:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8282112. Throughput: 0: 282.4. Samples: 2069888. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:30,484][00034] Avg episode reward: [(0, '-2.500')] [2024-08-05 07:44:30,650][00139] DAMAGECOUNT value on done: 106700.0 [2024-08-05 07:44:30,651][00139] Sum rewards: 1.468, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.700', 'AMMO4': '-0.032', 'AMMO2': '-0.006', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'AMMO3': '0.102', 'weapon5': '0.126', 'WEAPON5': '0.250', 'HITCOUNT': '0.290', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.380', 'weapon3': '1.496', 'weapon2': '2.032', 'FRAGCOUNT': '5.000'} [2024-08-05 07:44:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8282112. Throughput: 0: 283.7. Samples: 2071619. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:35,485][00034] Avg episode reward: [(0, '-2.411')] [2024-08-05 07:44:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8290304. Throughput: 0: 283.5. Samples: 2073332. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:40,484][00034] Avg episode reward: [(0, '-2.411')] [2024-08-05 07:44:44,983][00139] DAMAGECOUNT value on done: 100973.0 [2024-08-05 07:44:44,984][00139] Sum rewards: -7.322, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-2.576', 'AMMO2': '0.018', 'AMMO5': '0.022', 'weapon5': '0.038', 'weapon4': '0.040', 'ARMOR': '0.084', 'AMMO4': '0.089', 'WEAPON4': '0.150', 'AMMO3': '0.230', 'HITCOUNT': '0.230', 'WEAPON5': '0.400', 'WEAPON3': '1.200', 'DAMAGECOUNT': '1.299', 'weapon2': '1.450', 'FRAGCOUNT': '1.500', 'weapon3': '2.004'} [2024-08-05 07:44:45,222][00139] DAMAGECOUNT value on done: 106892.0 [2024-08-05 07:44:45,223][00139] Sum rewards: -0.828, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.624', 'AMMO5': '0.009', 'AMMO2': '0.009', 'AMMO4': '0.047', 'AMMO3': '0.085', 'weapon5': '0.124', 'HITCOUNT': '0.140', 'WEAPON5': '0.150', 'WEAPON4': '0.200', 'weapon4': '0.404', 'ARMOR': '0.514', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.576', 'FRAGCOUNT': '1.000', 'weapon2': '1.006', 'weapon3': '1.732'} [2024-08-05 07:44:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8298496. Throughput: 0: 284.2. Samples: 2074208. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:45,485][00034] Avg episode reward: [(0, '-2.397')] [2024-08-05 07:44:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8298496. Throughput: 0: 285.2. Samples: 2075927. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:50,486][00034] Avg episode reward: [(0, '-2.397')] [2024-08-05 07:44:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8306688. Throughput: 0: 284.7. Samples: 2077612. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:44:55,485][00034] Avg episode reward: [(0, '-2.397')] [2024-08-05 07:44:59,712][00139] DAMAGECOUNT value on done: 101323.0 [2024-08-05 07:44:59,713][00139] Sum rewards: -0.371, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-1.104', 'AMMO5': '0.005', 'AMMO2': '0.008', 'AMMO4': '0.040', 'ARMOR': '0.056', 'weapon5': '0.090', 'WEAPON5': '0.100', 'AMMO3': '0.116', 'WEAPON4': '0.200', 'HITCOUNT': '0.210', 'weapon4': '0.240', 'weapon2': '0.518', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.050', 'weapon3': '1.400', 'FRAGCOUNT': '2.000'} [2024-08-05 07:44:59,935][00139] DAMAGECOUNT value on done: 107076.0 [2024-08-05 07:44:59,935][00139] Sum rewards: -2.138, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.598', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'weapon5': '0.008', 'AMMO5': '0.013', 'WEAPON4': '0.100', 'AMMO3': '0.110', 'WEAPON5': '0.150', 'HITCOUNT': '0.180', 'ARMOR': '0.468', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.552', 'FRAGCOUNT': '1.000', 'weapon2': '1.798', 'weapon3': '1.816'} [2024-08-05 07:45:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8314880. Throughput: 0: 285.3. Samples: 2078496. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:00,485][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:45:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8314880. Throughput: 0: 287.6. Samples: 2080224. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:05,485][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:45:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8323072. Throughput: 0: 286.9. Samples: 2081923. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:10,485][00034] Avg episode reward: [(0, '-2.356')] [2024-08-05 07:45:14,267][00139] DAMAGECOUNT value on done: 101468.0 [2024-08-05 07:45:14,268][00139] Sum rewards: -5.542, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.297', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.013', 'weapon5': '0.020', 'HITCOUNT': '0.140', 'WEAPON5': '0.150', 'AMMO3': '0.167', 'DAMAGECOUNT': '0.435', 'ARMOR': '0.519', 'WEAPON3': '1.000', 'weapon2': '1.412', 'weapon3': '1.900', 'FRAGCOUNT': '2.000'} [2024-08-05 07:45:14,480][00139] DAMAGECOUNT value on done: 107265.0 [2024-08-05 07:45:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8331264. Throughput: 0: 287.5. Samples: 2082825. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:15,485][00034] Avg episode reward: [(0, '-2.424')] [2024-08-05 07:45:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8339456. Throughput: 0: 286.6. Samples: 2084517. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:20,484][00034] Avg episode reward: [(0, '-2.424')] [2024-08-05 07:45:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8339456. Throughput: 0: 285.8. Samples: 2086193. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:25,485][00034] Avg episode reward: [(0, '-2.424')] [2024-08-05 07:45:29,156][00139] DAMAGECOUNT value on done: 101786.0 [2024-08-05 07:45:29,157][00139] Sum rewards: 2.314, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.150', 'AMMO2': '0.007', 'AMMO5': '0.015', 'WEAPON1': '0.030', 'AMMO4': '0.036', 'weapon7': '0.048', 'AMMO3': '0.076', 'weapon4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.170', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.230', 'WEAPON5': '0.350', 'ARMOR': '0.482', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.954', 'weapon2': '1.340', 'weapon3': '1.636', 'FRAGCOUNT': '2.000'} [2024-08-05 07:45:29,374][00139] DAMAGECOUNT value on done: 107547.0 [2024-08-05 07:45:29,375][00139] Sum rewards: -6.318, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.740', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.013', 'ARMOR': '0.020', 'AMMO5': '0.025', 'AMMO4': '0.067', 'weapon4': '0.068', 'weapon5': '0.072', 'WEAPON4': '0.100', 'AMMO3': '0.155', 'HITCOUNT': '0.250', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.846', 'WEAPON3': '0.900', 'weapon2': '1.482', 'weapon3': '1.874'} [2024-08-05 07:45:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8347648. Throughput: 0: 285.3. Samples: 2087045. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:30,484][00034] Avg episode reward: [(0, '-2.402')] [2024-08-05 07:45:34,801][00138] Updated weights for policy 0, policy_version 1020 (0.0018) [2024-08-05 07:45:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8355840. Throughput: 0: 285.8. Samples: 2088789. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:35,484][00034] Avg episode reward: [(0, '-2.402')] [2024-08-05 07:45:35,496][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001020_8355840.pth... 
[2024-08-05 07:45:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000000986_8077312.pth [2024-08-05 07:45:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8355840. Throughput: 0: 286.4. Samples: 2090502. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:40,485][00034] Avg episode reward: [(0, '-2.402')] [2024-08-05 07:45:43,725][00139] DAMAGECOUNT value on done: 101996.0 [2024-08-05 07:45:43,726][00139] Sum rewards: -1.825, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.552', 'AMMO5': '0.005', 'weapon5': '0.008', 'AMMO2': '0.011', 'weapon4': '0.036', 'AMMO4': '0.056', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.179', 'HITCOUNT': '0.210', 'ARMOR': '0.499', 'DAMAGECOUNT': '0.630', 'WEAPON3': '0.950', 'weapon2': '1.482', 'FRAGCOUNT': '2.000', 'weapon3': '2.210'} [2024-08-05 07:45:43,960][00139] DAMAGECOUNT value on done: 108030.0 [2024-08-05 07:45:43,960][00139] Sum rewards: -2.606, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.190', 'AMMO5': '0.005', 'AMMO2': '0.020', 'weapon5': '0.032', 'ARMOR': '0.074', 'WEAPON5': '0.100', 'AMMO4': '0.101', 'AMMO3': '0.148', 'HITCOUNT': '0.260', 'WEAPON4': '0.300', 'weapon4': '0.384', 'WEAPON3': '0.700', 'weapon2': '1.374', 'weapon3': '1.386', 'DAMAGECOUNT': '1.449', 'FRAGCOUNT': '2.000'} [2024-08-05 07:45:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8364032. Throughput: 0: 285.8. Samples: 2091358. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:45,485][00034] Avg episode reward: [(0, '-2.329')] [2024-08-05 07:45:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8372224. Throughput: 0: 285.2. Samples: 2093057. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:50,484][00034] Avg episode reward: [(0, '-2.329')] [2024-08-05 07:45:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8372224. Throughput: 0: 284.0. Samples: 2094703. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:45:55,484][00034] Avg episode reward: [(0, '-2.329')] [2024-08-05 07:45:58,926][00139] DAMAGECOUNT value on done: 102226.0 [2024-08-05 07:45:58,927][00139] Sum rewards: -4.264, reward structure: {'DEATHCOUNT': '-6.750', 'FRAGCOUNT': '-2.000', 'HEALTH': '-0.748', 'AMMO2': '0.009', 'AMMO5': '0.015', 'ARMOR': '0.040', 'AMMO4': '0.045', 'AMMO3': '0.124', 'weapon5': '0.130', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.690', 'WEAPON3': '0.700', 'weapon2': '1.262', 'weapon3': '1.808'} [2024-08-05 07:45:59,148][00139] DAMAGECOUNT value on done: 108488.0 [2024-08-05 07:45:59,148][00139] Sum rewards: -1.998, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.683', 'AMMO4': '-0.088', 'AMMO2': '-0.018', 'AMMO5': '0.009', 'weapon5': '0.046', 'AMMO3': '0.122', 'WEAPON5': '0.150', 'HITCOUNT': '0.360', 'ARMOR': '0.559', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.374', 'weapon3': '1.854', 'weapon2': '1.866', 'FRAGCOUNT': '2.500'} [2024-08-05 07:46:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8380416. Throughput: 0: 282.2. Samples: 2095525. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:00,484][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 07:46:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8388608. Throughput: 0: 282.2. Samples: 2097214. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:05,485][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 07:46:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8388608. Throughput: 0: 282.8. Samples: 2098920. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:10,485][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 07:46:13,707][00139] DAMAGECOUNT value on done: 102619.0 [2024-08-05 07:46:13,707][00139] Sum rewards: 1.421, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO5': '0.007', 'WEAPON1': '0.010', 'AMMO2': '0.022', 'ARMOR': '0.060', 'AMMO3': '0.070', 'AMMO4': '0.109', 'WEAPON4': '0.150', 'weapon5': '0.180', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'HEALTH': '0.222', 'weapon4': '0.290', 'WEAPON3': '0.450', 'DAMAGECOUNT': '1.179', 'weapon2': '1.446', 'weapon3': '1.566', 'FRAGCOUNT': '2.000'} [2024-08-05 07:46:13,944][00139] DAMAGECOUNT value on done: 108786.0 [2024-08-05 07:46:13,944][00139] Sum rewards: -3.941, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.620', 'AMMO5': '0.003', 'AMMO2': '0.007', 'weapon5': '0.014', 'ARMOR': '0.024', 'AMMO4': '0.034', 'WEAPON5': '0.050', 'AMMO3': '0.121', 'weapon7': '0.164', 'WEAPON4': '0.200', 'AMMO6': '0.220', 'AMMO7': '0.220', 'HITCOUNT': '0.260', 'WEAPON7': '0.300', 'weapon4': '0.376', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.894', 'weapon3': '1.092', 'weapon2': '1.600'} [2024-08-05 07:46:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8396800. Throughput: 0: 282.6. Samples: 2099764. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:15,485][00034] Avg episode reward: [(0, '-2.384')] [2024-08-05 07:46:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8404992. Throughput: 0: 280.1. Samples: 2101394. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:20,484][00034] Avg episode reward: [(0, '-2.384')] [2024-08-05 07:46:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8404992. Throughput: 0: 279.0. Samples: 2103059. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:25,485][00034] Avg episode reward: [(0, '-2.384')] [2024-08-05 07:46:29,100][00139] DAMAGECOUNT value on done: 102814.0 [2024-08-05 07:46:29,101][00139] Sum rewards: -5.407, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.108', 'AMMO5': '0.007', 'AMMO2': '0.009', 'weapon5': '0.014', 'AMMO4': '0.046', 'WEAPON5': '0.100', 'AMMO3': '0.179', 'HITCOUNT': '0.190', 'WEAPON4': '0.300', 'weapon4': '0.320', 'ARMOR': '0.478', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.850', 'weapon3': '1.340', 'weapon2': '1.532', 'FRAGCOUNT': '2.000'} [2024-08-05 07:46:29,317][00139] DAMAGECOUNT value on done: 109311.0 [2024-08-05 07:46:29,317][00139] Sum rewards: -4.285, reward structure: {'DEATHCOUNT': '-12.750', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'AMMO2': '0.021', 'HEALTH': '0.064', 'weapon7': '0.074', 'weapon5': '0.098', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO4': '0.107', 'AMMO3': '0.115', 'WEAPON4': '0.150', 'weapon4': '0.214', 'WEAPON5': '0.300', 'HITCOUNT': '0.390', 'WEAPON3': '0.550', 'weapon3': '1.110', 'FRAGCOUNT': '1.500', 'DAMAGECOUNT': '1.575', 'weapon2': '1.872'} [2024-08-05 07:46:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8413184. Throughput: 0: 278.1. Samples: 2103873. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:30,484][00034] Avg episode reward: [(0, '-2.436')] [2024-08-05 07:46:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8421376. Throughput: 0: 278.7. Samples: 2105597. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:35,485][00034] Avg episode reward: [(0, '-2.436')] [2024-08-05 07:46:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8421376. Throughput: 0: 279.3. Samples: 2107273. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:40,484][00034] Avg episode reward: [(0, '-2.436')] [2024-08-05 07:46:43,876][00139] DAMAGECOUNT value on done: 103154.0 [2024-08-05 07:46:43,877][00139] Sum rewards: -6.155, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-1.620', 'AMMO2': '0.014', 'AMMO5': '0.017', 'ARMOR': '0.032', 'weapon5': '0.050', 'AMMO4': '0.071', 'WEAPON4': '0.150', 'AMMO3': '0.168', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'weapon4': '0.378', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.020', 'weapon3': '1.530', 'weapon2': '1.574', 'FRAGCOUNT': '3.500'} [2024-08-05 07:46:44,090][00139] DAMAGECOUNT value on done: 109540.0 [2024-08-05 07:46:44,090][00139] Sum rewards: -4.007, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.390', 'AMMO2': '0.006', 'AMMO5': '0.013', 'AMMO4': '0.032', 'WEAPON1': '0.040', 'weapon7': '0.052', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.150', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'weapon4': '0.200', 'HITCOUNT': '0.210', 'WEAPON5': '0.250', 'weapon5': '0.318', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.687', 'WEAPON3': '0.800', 'weapon3': '1.240', 'weapon2': '1.494'} [2024-08-05 07:46:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8429568. Throughput: 0: 279.5. Samples: 2108103. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:45,486][00034] Avg episode reward: [(0, '-2.481')] [2024-08-05 07:46:47,806][00138] Updated weights for policy 0, policy_version 1030 (0.0017) [2024-08-05 07:46:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8437760. Throughput: 0: 280.3. Samples: 2109828. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:50,484][00034] Avg episode reward: [(0, '-2.481')] [2024-08-05 07:46:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8445952. Throughput: 0: 279.5. Samples: 2111496. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:46:55,484][00034] Avg episode reward: [(0, '-2.481')] [2024-08-05 07:46:59,106][00139] DAMAGECOUNT value on done: 103834.0 [2024-08-05 07:46:59,107][00139] Sum rewards: 0.856, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.956', 'AMMO5': '0.015', 'WEAPON1': '0.020', 'AMMO2': '0.024', 'weapon7': '0.050', 'ARMOR': '0.064', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon5': '0.100', 'AMMO4': '0.120', 'AMMO3': '0.141', 'WEAPON4': '0.150', 'weapon4': '0.154', 'WEAPON5': '0.350', 'HITCOUNT': '0.370', 'WEAPON3': '0.750', 'weapon2': '1.414', 'weapon3': '1.500', 'DAMAGECOUNT': '2.040', 'FRAGCOUNT': '4.000'} [2024-08-05 07:46:59,349][00139] DAMAGECOUNT value on done: 109794.0 [2024-08-05 07:46:59,350][00139] Sum rewards: -4.789, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.850', 'FRAGCOUNT': '0.000', 'AMMO2': '0.007', 'AMMO5': '0.019', 'AMMO4': '0.035', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'AMMO3': '0.115', 'weapon4': '0.146', 'weapon5': '0.160', 'HITCOUNT': '0.210', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.762', 'WEAPON3': '0.850', 'weapon2': '1.500', 'weapon3': '1.866'} [2024-08-05 07:47:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8445952. Throughput: 0: 277.8. Samples: 2112264. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:00,485][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 07:47:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8454144. Throughput: 0: 279.1. Samples: 2113955. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:05,484][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 07:47:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8462336. Throughput: 0: 278.9. Samples: 2115608. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:10,484][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 07:47:14,116][00139] DAMAGECOUNT value on done: 103851.0 [2024-08-05 07:47:14,347][00139] DAMAGECOUNT value on done: 110344.0 [2024-08-05 07:47:14,348][00139] Sum rewards: -3.894, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.331', 'AMMO4': '-0.036', 'AMMO2': '-0.007', 'AMMO5': '0.019', 'ARMOR': '0.084', 'weapon5': '0.090', 'AMMO3': '0.149', 'WEAPON5': '0.200', 'HITCOUNT': '0.310', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.950', 'weapon2': '1.584', 'DAMAGECOUNT': '1.650', 'weapon3': '1.944'} [2024-08-05 07:47:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8462336. Throughput: 0: 279.4. Samples: 2116447. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:15,484][00034] Avg episode reward: [(0, '-2.463')] [2024-08-05 07:47:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8470528. Throughput: 0: 278.4. Samples: 2118124. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:20,484][00034] Avg episode reward: [(0, '-2.463')] [2024-08-05 07:47:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8478720. Throughput: 0: 279.6. Samples: 2119856. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:25,484][00034] Avg episode reward: [(0, '-2.463')] [2024-08-05 07:47:29,135][00139] DAMAGECOUNT value on done: 104409.0 [2024-08-05 07:47:29,136][00139] Sum rewards: 1.521, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.482', 'AMMO4': '-0.035', 'AMMO2': '-0.007', 'weapon5': '0.008', 'AMMO5': '0.010', 'ARMOR': '0.040', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'weapon7': '0.104', 'AMMO3': '0.138', 'WEAPON5': '0.200', 'HITCOUNT': '0.320', 'WEAPON3': '0.750', 'weapon3': '1.528', 'DAMAGECOUNT': '1.674', 'weapon2': '1.722', 'FRAGCOUNT': '3.000'} [2024-08-05 07:47:29,366][00139] DAMAGECOUNT value on done: 110824.0 [2024-08-05 07:47:29,367][00139] Sum rewards: -0.907, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.107', 'AMMO4': '-0.020', 'AMMO2': '-0.004', 'AMMO5': '0.005', 'weapon7': '0.030', 'ARMOR': '0.036', 'AMMO3': '0.109', 'weapon5': '0.142', 'WEAPON5': '0.150', 'AMMO6': '0.160', 'AMMO7': '0.160', 'WEAPON7': '0.200', 'HITCOUNT': '0.330', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.440', 'weapon3': '1.534', 'weapon2': '1.878'} [2024-08-05 07:47:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8478720. Throughput: 0: 279.5. Samples: 2120679. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:30,485][00034] Avg episode reward: [(0, '-2.493')] [2024-08-05 07:47:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8486912. Throughput: 0: 277.8. Samples: 2122327. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:35,484][00034] Avg episode reward: [(0, '-2.493')] [2024-08-05 07:47:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001036_8486912.pth... 
[2024-08-05 07:47:35,578][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001003_8216576.pth [2024-08-05 07:47:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8495104. Throughput: 0: 278.2. Samples: 2124013. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:40,484][00034] Avg episode reward: [(0, '-2.493')] [2024-08-05 07:47:44,114][00139] DAMAGECOUNT value on done: 104942.0 [2024-08-05 07:47:44,115][00139] Sum rewards: -2.806, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.382', 'FRAGCOUNT': '0.000', 'AMMO2': '0.015', 'WEAPON1': '0.020', 'AMMO5': '0.029', 'ARMOR': '0.040', 'AMMO4': '0.075', 'WEAPON4': '0.100', 'AMMO3': '0.128', 'weapon5': '0.176', 'weapon4': '0.232', 'HITCOUNT': '0.370', 'WEAPON5': '0.550', 'WEAPON3': '0.800', 'weapon2': '1.358', 'DAMAGECOUNT': '1.599', 'weapon3': '1.834'} [2024-08-05 07:47:44,326][00139] DAMAGECOUNT value on done: 111359.0 [2024-08-05 07:47:44,327][00139] Sum rewards: -4.637, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.842', 'AMMO2': '0.005', 'weapon5': '0.016', 'AMMO5': '0.017', 'AMMO4': '0.025', 'WEAPON4': '0.050', 'ARMOR': '0.056', 'weapon4': '0.092', 'WEAPON5': '0.150', 'AMMO3': '0.183', 'HITCOUNT': '0.420', 'FRAGCOUNT': '0.500', 'WEAPON3': '1.000', 'weapon2': '1.372', 'DAMAGECOUNT': '1.605', 'weapon3': '2.214'} [2024-08-05 07:47:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8495104. Throughput: 0: 280.0. Samples: 2124862. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:45,484][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 07:47:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8503296. Throughput: 0: 279.8. Samples: 2126545. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:50,484][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 07:47:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8511488. Throughput: 0: 281.1. Samples: 2128258. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:47:55,484][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 07:47:58,947][00139] DAMAGECOUNT value on done: 105347.0 [2024-08-05 07:47:58,948][00139] Sum rewards: -3.222, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.831', 'AMMO2': '0.007', 'AMMO5': '0.020', 'AMMO4': '0.037', 'ARMOR': '0.040', 'weapon5': '0.070', 'AMMO3': '0.146', 'WEAPON5': '0.300', 'HITCOUNT': '0.350', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.215', 'weapon2': '1.464', 'FRAGCOUNT': '1.500', 'weapon3': '2.060'} [2024-08-05 07:47:59,185][00139] DAMAGECOUNT value on done: 111574.0 [2024-08-05 07:47:59,186][00139] Sum rewards: -2.707, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.895', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO2': '0.018', 'WEAPON1': '0.020', 'ARMOR': '0.020', 'AMMO4': '0.090', 'AMMO3': '0.116', 'HITCOUNT': '0.190', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'weapon4': '0.332', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.314', 'weapon3': '1.676'} [2024-08-05 07:48:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8511488. Throughput: 0: 281.2. Samples: 2129103. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:00,485][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:48:01,325][00138] Updated weights for policy 0, policy_version 1040 (0.0018) [2024-08-05 07:48:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8519680. Throughput: 0: 279.9. Samples: 2130721. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:05,485][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:48:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8527872. Throughput: 0: 278.8. Samples: 2132402. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:10,484][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:48:14,183][00139] DAMAGECOUNT value on done: 105622.0 [2024-08-05 07:48:14,183][00139] Sum rewards: -2.313, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-0.500', 'HEALTH': '-0.446', 'AMMO5': '0.010', 'AMMO2': '0.018', 'weapon5': '0.030', 'weapon7': '0.082', 'AMMO4': '0.092', 'AMMO3': '0.100', 'WEAPON5': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'ARMOR': '0.120', 'HITCOUNT': '0.180', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'weapon4': '0.434', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.825', 'weapon2': '1.354', 'weapon3': '1.598'} [2024-08-05 07:48:14,411][00139] DAMAGECOUNT value on done: 111859.0 [2024-08-05 07:48:14,411][00139] Sum rewards: -8.286, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.292', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.011', 'AMMO5': '0.023', 'ARMOR': '0.036', 'AMMO4': '0.057', 'weapon5': '0.060', 'weapon4': '0.082', 'WEAPON4': '0.100', 'HITCOUNT': '0.150', 'AMMO3': '0.186', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.855', 'WEAPON3': '1.050', 'weapon2': '1.282', 'weapon3': '2.264'} [2024-08-05 07:48:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8527872. Throughput: 0: 279.2. Samples: 2133241. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:15,484][00034] Avg episode reward: [(0, '-2.641')] [2024-08-05 07:48:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8536064. Throughput: 0: 279.3. Samples: 2134897. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:20,485][00034] Avg episode reward: [(0, '-2.641')] [2024-08-05 07:48:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8544256. Throughput: 0: 279.0. Samples: 2136566. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:25,485][00034] Avg episode reward: [(0, '-2.641')] [2024-08-05 07:48:29,256][00139] DAMAGECOUNT value on done: 105947.0 [2024-08-05 07:48:29,257][00139] Sum rewards: -0.627, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.451', 'AMMO5': '0.010', 'AMMO2': '0.012', 'ARMOR': '0.032', 'AMMO4': '0.059', 'weapon5': '0.068', 'weapon7': '0.068', 'AMMO3': '0.086', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.240', 'WEAPON5': '0.250', 'weapon4': '0.278', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.975', 'weapon2': '1.506', 'weapon3': '1.750', 'FRAGCOUNT': '2.500'} [2024-08-05 07:48:29,486][00139] DAMAGECOUNT value on done: 112307.0 [2024-08-05 07:48:29,487][00139] Sum rewards: -1.436, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.793', 'AMMO5': '0.012', 'AMMO2': '0.016', 'ARMOR': '0.040', 'weapon7': '0.068', 'AMMO4': '0.080', 'WEAPON4': '0.100', 'weapon5': '0.106', 'weapon4': '0.110', 'AMMO3': '0.124', 'WEAPON5': '0.150', 'AMMO6': '0.160', 'AMMO7': '0.160', 'HITCOUNT': '0.170', 'WEAPON7': '0.200', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.344', 'weapon3': '1.568', 'weapon2': '1.898', 'FRAGCOUNT': '3.000'} [2024-08-05 07:48:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8544256. Throughput: 0: 279.2. Samples: 2137428. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:30,484][00034] Avg episode reward: [(0, '-2.548')] [2024-08-05 07:48:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8552448. Throughput: 0: 278.3. Samples: 2139069. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:35,484][00034] Avg episode reward: [(0, '-2.548')] [2024-08-05 07:48:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8560640. Throughput: 0: 277.0. Samples: 2140722. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:40,485][00034] Avg episode reward: [(0, '-2.548')] [2024-08-05 07:48:44,284][00139] DAMAGECOUNT value on done: 106343.0 [2024-08-05 07:48:44,284][00139] Sum rewards: -6.074, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.940', 'FRAGCOUNT': '-1.000', 'AMMO2': '0.007', 'AMMO5': '0.008', 'ARMOR': '0.020', 'AMMO4': '0.036', 'weapon5': '0.126', 'AMMO3': '0.140', 'HITCOUNT': '0.140', 'WEAPON5': '0.200', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.188', 'weapon3': '1.576', 'weapon2': '1.624'} [2024-08-05 07:48:44,522][00139] DAMAGECOUNT value on done: 112457.0 [2024-08-05 07:48:44,523][00139] Sum rewards: -1.897, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.220', 'AMMO4': '-0.019', 'AMMO2': '-0.004', 'AMMO5': '0.012', 'weapon5': '0.066', 'WEAPON4': '0.100', 'AMMO3': '0.130', 'ARMOR': '0.133', 'WEAPON5': '0.150', 'HITCOUNT': '0.150', 'weapon4': '0.188', 'DAMAGECOUNT': '0.450', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.350', 'weapon3': '1.566'} [2024-08-05 07:48:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8568832. Throughput: 0: 277.5. Samples: 2141590. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:45,484][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 07:48:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8568832. Throughput: 0: 279.1. Samples: 2143279. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:50,484][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 07:48:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8577024. 
Throughput: 0: 279.9. Samples: 2144997. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:48:55,484][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 07:48:59,144][00139] DAMAGECOUNT value on done: 107041.0 [2024-08-05 07:48:59,144][00139] Sum rewards: 0.920, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.169', 'AMMO2': '0.003', 'AMMO5': '0.007', 'AMMO4': '0.015', 'ARMOR': '0.036', 'WEAPON5': '0.100', 'AMMO3': '0.155', 'weapon5': '0.162', 'HITCOUNT': '0.340', 'WEAPON3': '0.950', 'weapon3': '1.604', 'weapon2': '1.622', 'DAMAGECOUNT': '2.094', 'FRAGCOUNT': '4.000'} [2024-08-05 07:48:59,376][00139] DAMAGECOUNT value on done: 112863.0 [2024-08-05 07:48:59,377][00139] Sum rewards: 2.211, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.947', 'AMMO2': '0.012', 'AMMO5': '0.017', 'weapon4': '0.042', 'AMMO4': '0.061', 'weapon7': '0.068', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.114', 'weapon5': '0.136', 'WEAPON4': '0.150', 'HITCOUNT': '0.240', 'WEAPON5': '0.350', 'ARMOR': '0.400', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.218', 'weapon2': '1.378', 'weapon3': '1.722', 'FRAGCOUNT': '3.000'} [2024-08-05 07:49:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8585216. Throughput: 0: 280.4. Samples: 2145860. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:00,485][00034] Avg episode reward: [(0, '-2.568')] [2024-08-05 07:49:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8585216. Throughput: 0: 280.4. Samples: 2147514. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:05,484][00034] Avg episode reward: [(0, '-2.568')] [2024-08-05 07:49:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8593408. Throughput: 0: 281.5. Samples: 2149234. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:10,484][00034] Avg episode reward: [(0, '-2.568')] [2024-08-05 07:49:13,896][00139] DAMAGECOUNT value on done: 107266.0 [2024-08-05 07:49:13,897][00139] Sum rewards: -4.906, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.260', 'AMMO2': '0.016', 'WEAPON4': '0.050', 'ARMOR': '0.060', 'AMMO4': '0.077', 'weapon7': '0.092', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.144', 'HITCOUNT': '0.170', 'weapon4': '0.196', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.675', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.408', 'weapon3': '1.676'} [2024-08-05 07:49:14,146][00139] DAMAGECOUNT value on done: 113133.0 [2024-08-05 07:49:14,147][00139] Sum rewards: -2.776, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.380', 'AMMO4': '-0.065', 'AMMO2': '-0.013', 'AMMO5': '0.014', 'weapon5': '0.030', 'ARMOR': '0.048', 'WEAPON4': '0.100', 'AMMO3': '0.175', 'WEAPON5': '0.200', 'weapon4': '0.210', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.810', 'WEAPON3': '0.950', 'weapon2': '1.456', 'weapon3': '1.718', 'FRAGCOUNT': '2.000'} [2024-08-05 07:49:14,251][00138] Updated weights for policy 0, policy_version 1050 (0.0018) [2024-08-05 07:49:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8601600. Throughput: 0: 281.8. Samples: 2150110. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:15,485][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:49:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8601600. Throughput: 0: 283.4. Samples: 2151824. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:20,485][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:49:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8609792. Throughput: 0: 284.4. Samples: 2153518. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:25,484][00034] Avg episode reward: [(0, '-2.622')] [2024-08-05 07:49:28,958][00139] DAMAGECOUNT value on done: 107416.0 [2024-08-05 07:49:28,959][00139] Sum rewards: -2.807, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.470', 'AMMO5': '0.007', 'AMMO2': '0.011', 'WEAPON1': '0.020', 'ARMOR': '0.036', 'AMMO4': '0.056', 'weapon5': '0.056', 'WEAPON5': '0.100', 'HITCOUNT': '0.110', 'AMMO3': '0.126', 'DAMAGECOUNT': '0.450', 'WEAPON3': '0.750', 'weapon2': '1.832', 'weapon3': '1.858', 'FRAGCOUNT': '3.000'} [2024-08-05 07:49:29,179][00139] DAMAGECOUNT value on done: 113668.0 [2024-08-05 07:49:29,179][00139] Sum rewards: -1.230, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.414', 'AMMO2': '0.011', 'AMMO5': '0.012', 'AMMO4': '0.054', 'weapon4': '0.060', 'ARMOR': '0.080', 'WEAPON4': '0.100', 'weapon5': '0.102', 'AMMO3': '0.175', 'WEAPON5': '0.200', 'HITCOUNT': '0.380', 'WEAPON3': '1.050', 'weapon2': '1.512', 'DAMAGECOUNT': '1.605', 'weapon3': '2.092', 'FRAGCOUNT': '4.000'} [2024-08-05 07:49:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8617984. Throughput: 0: 283.1. Samples: 2154328. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:30,485][00034] Avg episode reward: [(0, '-2.604')] [2024-08-05 07:49:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8617984. Throughput: 0: 283.2. Samples: 2156022. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:35,484][00034] Avg episode reward: [(0, '-2.604')] [2024-08-05 07:49:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001052_8617984.pth... [2024-08-05 07:49:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001020_8355840.pth [2024-08-05 07:49:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8626176. 
Throughput: 0: 281.8. Samples: 2157680. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:40,485][00034] Avg episode reward: [(0, '-2.604')] [2024-08-05 07:49:43,835][00139] DAMAGECOUNT value on done: 107488.0 [2024-08-05 07:49:44,063][00139] DAMAGECOUNT value on done: 114271.0 [2024-08-05 07:49:44,064][00139] Sum rewards: -0.080, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.575', 'AMMO4': '-0.036', 'AMMO2': '-0.007', 'weapon5': '0.006', 'AMMO5': '0.010', 'ARMOR': '0.044', 'AMMO3': '0.171', 'WEAPON5': '0.200', 'HITCOUNT': '0.480', 'WEAPON3': '1.250', 'weapon2': '1.264', 'DAMAGECOUNT': '1.809', 'weapon3': '2.554', 'FRAGCOUNT': '6.000'} [2024-08-05 07:49:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8634368. Throughput: 0: 281.8. Samples: 2158543. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:45,484][00034] Avg episode reward: [(0, '-2.646')] [2024-08-05 07:49:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8634368. Throughput: 0: 282.6. Samples: 2160230. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:50,485][00034] Avg episode reward: [(0, '-2.646')] [2024-08-05 07:49:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8642560. Throughput: 0: 283.3. Samples: 2161981. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:49:55,484][00034] Avg episode reward: [(0, '-2.646')] [2024-08-05 07:49:58,476][00139] DAMAGECOUNT value on done: 107750.0 [2024-08-05 07:49:58,477][00139] Sum rewards: -5.807, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.724', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.018', 'weapon7': '0.024', 'AMMO5': '0.026', 'weapon5': '0.088', 'AMMO4': '0.092', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.135', 'HITCOUNT': '0.240', 'WEAPON5': '0.350', 'ARMOR': '0.400', 'DAMAGECOUNT': '0.786', 'WEAPON3': '0.850', 'weapon2': '1.338', 'weapon3': '2.020'} [2024-08-05 07:49:58,715][00139] DAMAGECOUNT value on done: 114419.0 [2024-08-05 07:49:58,716][00139] Sum rewards: -4.034, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.695', 'AMMO2': '0.006', 'AMMO5': '0.020', 'AMMO4': '0.028', 'ARMOR': '0.036', 'HITCOUNT': '0.070', 'AMMO3': '0.123', 'weapon5': '0.134', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.444', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon2': '1.696', 'weapon3': '1.854'} [2024-08-05 07:50:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8650752. Throughput: 0: 282.9. Samples: 2162841. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:00,484][00034] Avg episode reward: [(0, '-2.814')] [2024-08-05 07:50:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8658944. Throughput: 0: 281.9. Samples: 2164508. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:05,484][00034] Avg episode reward: [(0, '-2.814')] [2024-08-05 07:50:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8658944. Throughput: 0: 280.4. Samples: 2166138. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:10,486][00034] Avg episode reward: [(0, '-2.814')] [2024-08-05 07:50:13,600][00139] DAMAGECOUNT value on done: 108089.0 [2024-08-05 07:50:13,601][00139] Sum rewards: -3.896, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.308', 'AMMO2': '0.010', 'AMMO5': '0.015', 'AMMO4': '0.050', 'AMMO3': '0.129', 'weapon5': '0.168', 'HITCOUNT': '0.190', 'WEAPON5': '0.300', 'ARMOR': '0.404', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.017', 'weapon2': '1.646', 'weapon3': '1.882'} [2024-08-05 07:50:13,836][00139] DAMAGECOUNT value on done: 114738.0 [2024-08-05 07:50:13,836][00139] Sum rewards: -2.573, reward structure: {'DEATHCOUNT': '-6.750', 'FRAGCOUNT': '-1.000', 'HEALTH': '-0.480', 'AMMO2': '0.008', 'AMMO5': '0.010', 'AMMO4': '0.042', 'weapon5': '0.086', 'WEAPON4': '0.100', 'weapon4': '0.110', 'AMMO3': '0.112', 'WEAPON5': '0.200', 'HITCOUNT': '0.300', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.957', 'weapon3': '1.536', 'weapon2': '1.596'} [2024-08-05 07:50:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8667136. Throughput: 0: 281.5. Samples: 2166997. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:15,485][00034] Avg episode reward: [(0, '-2.845')] [2024-08-05 07:50:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8675328. Throughput: 0: 281.6. Samples: 2168696. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:20,484][00034] Avg episode reward: [(0, '-2.845')] [2024-08-05 07:50:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8675328. Throughput: 0: 283.7. Samples: 2170445. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:25,485][00034] Avg episode reward: [(0, '-2.845')] [2024-08-05 07:50:26,730][00138] Updated weights for policy 0, policy_version 1060 (0.0018) [2024-08-05 07:50:28,249][00139] DAMAGECOUNT value on done: 108494.0 [2024-08-05 07:50:28,249][00139] Sum rewards: 1.166, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.976', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'AMMO5': '0.016', 'WEAPON1': '0.020', 'ARMOR': '0.028', 'weapon7': '0.030', 'WEAPON4': '0.100', 'AMMO3': '0.119', 'AMMO6': '0.160', 'AMMO7': '0.160', 'WEAPON7': '0.200', 'weapon5': '0.206', 'WEAPON5': '0.300', 'HITCOUNT': '0.320', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.215', 'weapon3': '1.610', 'weapon2': '1.814', 'FRAGCOUNT': '2.000'} [2024-08-05 07:50:28,471][00139] DAMAGECOUNT value on done: 115029.0 [2024-08-05 07:50:28,472][00139] Sum rewards: 1.987, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO5': '0.003', 'AMMO2': '0.012', 'WEAPON5': '0.050', 'weapon5': '0.058', 'AMMO4': '0.061', 'WEAPON4': '0.100', 'weapon4': '0.102', 'AMMO3': '0.125', 'HEALTH': '0.136', 'HITCOUNT': '0.300', 'ARMOR': '0.517', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.873', 'weapon2': '1.346', 'weapon3': '1.954', 'FRAGCOUNT': '2.500'} [2024-08-05 07:50:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8683520. Throughput: 0: 283.6. Samples: 2171303. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:30,485][00034] Avg episode reward: [(0, '-2.751')] [2024-08-05 07:50:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8691712. Throughput: 0: 283.9. Samples: 2173004. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:35,484][00034] Avg episode reward: [(0, '-2.751')] [2024-08-05 07:50:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8691712. Throughput: 0: 281.5. Samples: 2174649. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:40,484][00034] Avg episode reward: [(0, '-2.751')] [2024-08-05 07:50:43,335][00139] DAMAGECOUNT value on done: 108746.0 [2024-08-05 07:50:43,561][00139] DAMAGECOUNT value on done: 115508.0 [2024-08-05 07:50:43,561][00139] Sum rewards: 4.330, reward structure: {'DEATHCOUNT': '-9.000', 'AMMO5': '0.007', 'AMMO2': '0.022', 'ARMOR': '0.040', 'WEAPON5': '0.100', 'AMMO4': '0.109', 'AMMO3': '0.125', 'weapon5': '0.160', 'HITCOUNT': '0.310', 'WEAPON3': '0.600', 'HEALTH': '0.768', 'DAMAGECOUNT': '1.437', 'weapon2': '1.804', 'weapon3': '1.848', 'FRAGCOUNT': '6.000'} [2024-08-05 07:50:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8699904. Throughput: 0: 280.8. Samples: 2175476. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:45,485][00034] Avg episode reward: [(0, '-2.692')] [2024-08-05 07:50:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8708096. Throughput: 0: 281.3. Samples: 2177165. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:50,484][00034] Avg episode reward: [(0, '-2.692')] [2024-08-05 07:50:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8708096. Throughput: 0: 284.1. Samples: 2178921. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:50:55,485][00034] Avg episode reward: [(0, '-2.692')] [2024-08-05 07:50:57,965][00139] DAMAGECOUNT value on done: 108853.0 [2024-08-05 07:50:58,198][00139] DAMAGECOUNT value on done: 116167.0 [2024-08-05 07:50:58,198][00139] Sum rewards: 3.215, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO2': '0.011', 'AMMO5': '0.015', 'ARMOR': '0.040', 'AMMO4': '0.057', 'HEALTH': '0.074', 'AMMO3': '0.107', 'WEAPON5': '0.250', 'weapon5': '0.266', 'HITCOUNT': '0.350', 'WEAPON3': '0.600', 'weapon3': '1.602', 'weapon2': '1.616', 'DAMAGECOUNT': '1.977', 'FRAGCOUNT': '3.000'} [2024-08-05 07:51:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8716288. Throughput: 0: 284.1. Samples: 2179780. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:00,485][00034] Avg episode reward: [(0, '-2.755')] [2024-08-05 07:51:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8724480. Throughput: 0: 283.5. Samples: 2181455. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:05,484][00034] Avg episode reward: [(0, '-2.755')] [2024-08-05 07:51:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8732672. Throughput: 0: 282.1. Samples: 2183139. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:10,484][00034] Avg episode reward: [(0, '-2.755')] [2024-08-05 07:51:12,968][00139] DAMAGECOUNT value on done: 108963.0 [2024-08-05 07:51:12,968][00139] Sum rewards: -7.837, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.406', 'AMMO4': '-0.036', 'AMMO2': '-0.007', 'AMMO5': '0.012', 'ARMOR': '0.016', 'HITCOUNT': '0.100', 'AMMO3': '0.105', 'weapon5': '0.112', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.330', 'WEAPON3': '0.600', 'weapon3': '1.460', 'weapon2': '1.826'} [2024-08-05 07:51:13,205][00139] DAMAGECOUNT value on done: 116367.0 [2024-08-05 07:51:13,205][00139] Sum rewards: 0.286, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.334', 'AMMO4': '-0.002', 'AMMO2': '-0.000', 'AMMO5': '0.003', 'weapon5': '0.020', 'ARMOR': '0.035', 'WEAPON5': '0.050', 'AMMO3': '0.108', 'HITCOUNT': '0.180', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.600', 'weapon3': '1.218', 'weapon2': '1.308', 'FRAGCOUNT': '2.000'} [2024-08-05 07:51:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8732672. Throughput: 0: 281.7. Samples: 2183979. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:15,484][00034] Avg episode reward: [(0, '-2.839')] [2024-08-05 07:51:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8740864. Throughput: 0: 282.4. Samples: 2185710. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:20,484][00034] Avg episode reward: [(0, '-2.839')] [2024-08-05 07:51:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8749056. Throughput: 0: 284.7. Samples: 2187462. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:25,485][00034] Avg episode reward: [(0, '-2.839')] [2024-08-05 07:51:27,371][00139] DAMAGECOUNT value on done: 109404.0 [2024-08-05 07:51:27,372][00139] Sum rewards: -4.955, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-2.322', 'AMMO2': '0.003', 'AMMO5': '0.014', 'AMMO4': '0.017', 'ARMOR': '0.068', 'weapon5': '0.132', 'weapon4': '0.174', 'AMMO3': '0.186', 'WEAPON4': '0.200', 'WEAPON5': '0.300', 'HITCOUNT': '0.370', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.323', 'weapon2': '1.656', 'weapon3': '1.772', 'FRAGCOUNT': '3.000'} [2024-08-05 07:51:27,589][00139] DAMAGECOUNT value on done: 116562.0 [2024-08-05 07:51:27,590][00139] Sum rewards: -1.332, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.115', 'AMMO2': '0.004', 'AMMO4': '0.018', 'WEAPON4': '0.100', 'ARMOR': '0.112', 'AMMO3': '0.118', 'weapon4': '0.118', 'HITCOUNT': '0.150', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.710', 'weapon3': '1.818'} [2024-08-05 07:51:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8749056. Throughput: 0: 286.1. Samples: 2188352. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:30,484][00034] Avg episode reward: [(0, '-2.878')] [2024-08-05 07:51:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8757248. Throughput: 0: 285.8. Samples: 2190025. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:35,484][00034] Avg episode reward: [(0, '-2.878')] [2024-08-05 07:51:35,495][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001069_8757248.pth... 
[2024-08-05 07:51:35,570][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001036_8486912.pth [2024-08-05 07:51:38,783][00138] Updated weights for policy 0, policy_version 1070 (0.0017) [2024-08-05 07:51:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8765440. Throughput: 0: 285.0. Samples: 2191746. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:40,486][00034] Avg episode reward: [(0, '-2.878')] [2024-08-05 07:51:42,366][00139] DAMAGECOUNT value on done: 110150.0 [2024-08-05 07:51:42,367][00139] Sum rewards: 2.181, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.588', 'WEAPON1': '0.010', 'AMMO2': '0.016', 'AMMO5': '0.020', 'ARMOR': '0.024', 'AMMO4': '0.078', 'weapon5': '0.114', 'AMMO3': '0.183', 'WEAPON5': '0.250', 'HITCOUNT': '0.510', 'WEAPON3': '0.950', 'weapon3': '1.782', 'weapon2': '1.844', 'DAMAGECOUNT': '2.238', 'FRAGCOUNT': '4.500'} [2024-08-05 07:51:42,589][00139] DAMAGECOUNT value on done: 116951.0 [2024-08-05 07:51:42,589][00139] Sum rewards: -5.380, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.866', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.007', 'AMMO5': '0.013', 'WEAPON1': '0.020', 'AMMO4': '0.033', 'WEAPON4': '0.050', 'ARMOR': '0.093', 'weapon5': '0.100', 'weapon4': '0.136', 'AMMO3': '0.202', 'WEAPON5': '0.300', 'HITCOUNT': '0.300', 'WEAPON3': '1.100', 'DAMAGECOUNT': '1.167', 'weapon3': '1.594', 'weapon2': '1.622'} [2024-08-05 07:51:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8765440. Throughput: 0: 283.7. Samples: 2192547. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:45,484][00034] Avg episode reward: [(0, '-2.797')] [2024-08-05 07:51:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8773632. Throughput: 0: 284.3. Samples: 2194248. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:50,486][00034] Avg episode reward: [(0, '-2.797')] [2024-08-05 07:51:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8781824. Throughput: 0: 284.8. Samples: 2195954. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:51:55,485][00034] Avg episode reward: [(0, '-2.797')] [2024-08-05 07:51:57,175][00139] DAMAGECOUNT value on done: 110245.0 [2024-08-05 07:51:57,176][00139] Sum rewards: -3.071, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.180', 'AMMO4': '-0.017', 'AMMO2': '-0.003', 'AMMO5': '0.015', 'ARMOR': '0.020', 'weapon5': '0.088', 'HITCOUNT': '0.090', 'AMMO3': '0.119', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.285', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon2': '1.414', 'weapon3': '1.748'} [2024-08-05 07:51:57,417][00139] DAMAGECOUNT value on done: 117232.0 [2024-08-05 07:51:57,418][00139] Sum rewards: -7.144, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-1.702', 'AMMO5': '0.005', 'AMMO2': '0.010', 'ARMOR': '0.040', 'AMMO4': '0.048', 'weapon5': '0.068', 'WEAPON5': '0.150', 'AMMO3': '0.176', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.831', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon3': '1.842', 'weapon2': '1.868'} [2024-08-05 07:52:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8781824. Throughput: 0: 285.0. Samples: 2196805. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:00,485][00034] Avg episode reward: [(0, '-2.853')] [2024-08-05 07:52:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8790016. Throughput: 0: 284.7. Samples: 2198520. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:05,485][00034] Avg episode reward: [(0, '-2.853')] [2024-08-05 07:52:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8798208. Throughput: 0: 283.0. Samples: 2200196. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:10,484][00034] Avg episode reward: [(0, '-2.853')] [2024-08-05 07:52:12,050][00139] DAMAGECOUNT value on done: 110501.0 [2024-08-05 07:52:12,051][00139] Sum rewards: -7.427, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-3.198', 'AMMO4': '-0.026', 'AMMO2': '-0.005', 'AMMO5': '0.015', 'weapon5': '0.054', 'ARMOR': '0.056', 'AMMO3': '0.225', 'HITCOUNT': '0.230', 'WEAPON5': '0.350', 'DAMAGECOUNT': '0.768', 'FRAGCOUNT': '1.000', 'weapon2': '1.316', 'WEAPON3': '1.450', 'weapon3': '2.338'} [2024-08-05 07:52:12,343][00139] DAMAGECOUNT value on done: 117440.0 [2024-08-05 07:52:12,344][00139] Sum rewards: -2.758, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.040', 'AMMO4': '-0.007', 'AMMO2': '-0.001', 'AMMO5': '0.007', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'weapon5': '0.078', 'HITCOUNT': '0.110', 'AMMO3': '0.126', 'weapon4': '0.192', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.624', 'WEAPON3': '0.650', 'weapon3': '1.506', 'weapon2': '1.856'} [2024-08-05 07:52:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8806400. Throughput: 0: 280.9. Samples: 2200993. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:15,484][00034] Avg episode reward: [(0, '-2.914')] [2024-08-05 07:52:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8806400. Throughput: 0: 280.7. Samples: 2202657. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:20,485][00034] Avg episode reward: [(0, '-2.914')] [2024-08-05 07:52:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8814592. Throughput: 0: 281.6. Samples: 2204418. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:25,484][00034] Avg episode reward: [(0, '-2.914')] [2024-08-05 07:52:26,925][00139] DAMAGECOUNT value on done: 110901.0 [2024-08-05 07:52:26,926][00139] Sum rewards: -10.009, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.836', 'FRAGCOUNT': '-2.000', 'AMMO5': '0.008', 'AMMO2': '0.011', 'ARMOR': '0.056', 'AMMO4': '0.056', 'weapon5': '0.078', 'WEAPON4': '0.150', 'AMMO3': '0.157', 'weapon4': '0.190', 'WEAPON5': '0.200', 'HITCOUNT': '0.340', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.200', 'weapon2': '1.304', 'weapon3': '2.126'} [2024-08-05 07:52:27,160][00139] DAMAGECOUNT value on done: 117776.0 [2024-08-05 07:52:27,161][00139] Sum rewards: -3.845, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.520', 'AMMO4': '-0.041', 'AMMO2': '-0.008', 'AMMO5': '0.005', 'weapon5': '0.048', 'ARMOR': '0.060', 'WEAPON5': '0.150', 'AMMO3': '0.179', 'HITCOUNT': '0.290', 'DAMAGECOUNT': '1.008', 'WEAPON3': '1.150', 'weapon2': '1.376', 'FRAGCOUNT': '2.000', 'weapon3': '2.208'} [2024-08-05 07:52:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8822784. Throughput: 0: 283.0. Samples: 2205283. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:30,485][00034] Avg episode reward: [(0, '-2.979')] [2024-08-05 07:52:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8822784. Throughput: 0: 282.8. Samples: 2206976. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:35,485][00034] Avg episode reward: [(0, '-2.979')] [2024-08-05 07:52:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8830976. Throughput: 0: 282.6. Samples: 2208669. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:40,484][00034] Avg episode reward: [(0, '-2.979')] [2024-08-05 07:52:41,726][00139] DAMAGECOUNT value on done: 111391.0 [2024-08-05 07:52:41,727][00139] Sum rewards: -5.168, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.900', 'AMMO5': '0.003', 'AMMO2': '0.004', 'AMMO4': '0.020', 'WEAPON1': '0.020', 'ARMOR': '0.032', 'weapon5': '0.080', 'WEAPON5': '0.100', 'AMMO3': '0.127', 'HITCOUNT': '0.240', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.470', 'FRAGCOUNT': '1.500', 'weapon3': '1.584', 'weapon2': '1.802'} [2024-08-05 07:52:41,964][00139] DAMAGECOUNT value on done: 118271.0 [2024-08-05 07:52:41,965][00139] Sum rewards: -1.894, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.164', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'AMMO5': '0.014', 'weapon7': '0.018', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'weapon5': '0.116', 'AMMO3': '0.211', 'HITCOUNT': '0.280', 'WEAPON5': '0.350', 'WEAPON3': '1.000', 'weapon2': '1.394', 'DAMAGECOUNT': '1.485', 'weapon3': '2.136', 'FRAGCOUNT': '4.000'} [2024-08-05 07:52:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8839168. Throughput: 0: 282.8. Samples: 2209532. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:45,484][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:52:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8839168. Throughput: 0: 282.0. Samples: 2211209. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:50,484][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:52:51,440][00138] Updated weights for policy 0, policy_version 1080 (0.0018) [2024-08-05 07:52:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8847360. Throughput: 0: 282.6. Samples: 2212915. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:52:55,484][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:52:56,619][00139] DAMAGECOUNT value on done: 111548.0 [2024-08-05 07:52:56,620][00139] Sum rewards: -5.658, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.771', 'FRAGCOUNT': '-0.500', 'AMMO4': '-0.036', 'AMMO2': '-0.007', 'ARMOR': '0.004', 'AMMO5': '0.016', 'weapon5': '0.118', 'AMMO3': '0.123', 'HITCOUNT': '0.150', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.471', 'WEAPON3': '0.800', 'weapon2': '1.658', 'weapon3': '2.066'} [2024-08-05 07:52:56,870][00139] DAMAGECOUNT value on done: 118506.0 [2024-08-05 07:52:56,871][00139] Sum rewards: -2.147, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.887', 'AMMO5': '0.007', 'AMMO2': '0.012', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'AMMO4': '0.061', 'weapon5': '0.118', 'AMMO3': '0.133', 'ARMOR': '0.135', 'HITCOUNT': '0.160', 'DAMAGECOUNT': '0.705', 'WEAPON3': '0.900', 'weapon2': '1.676', 'weapon3': '1.982', 'FRAGCOUNT': '2.000'} [2024-08-05 07:53:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8855552. Throughput: 0: 284.1. Samples: 2213777. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:00,485][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:53:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8855552. Throughput: 0: 285.6. Samples: 2215510. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:05,484][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:53:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8863744. Throughput: 0: 283.9. Samples: 2217193. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:10,485][00034] Avg episode reward: [(0, '-3.051')] [2024-08-05 07:53:11,336][00139] DAMAGECOUNT value on done: 111768.0 [2024-08-05 07:53:11,337][00139] Sum rewards: -0.371, reward structure: {'DEATHCOUNT': '-7.500', 'AMMO5': '0.007', 'AMMO2': '0.011', 'weapon5': '0.014', 'WEAPON1': '0.020', 'AMMO4': '0.056', 'AMMO3': '0.121', 'ARMOR': '0.140', 'WEAPON5': '0.150', 'HITCOUNT': '0.180', 'HEALTH': '0.453', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.660', 'FRAGCOUNT': '1.000', 'weapon2': '1.656', 'weapon3': '2.010'} [2024-08-05 07:53:11,557][00139] DAMAGECOUNT value on done: 118833.0 [2024-08-05 07:53:11,558][00139] Sum rewards: -5.409, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-2.000', 'ARMOR': '0.004', 'AMMO5': '0.012', 'AMMO2': '0.017', 'weapon5': '0.040', 'AMMO4': '0.085', 'weapon4': '0.098', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.151', 'HEALTH': '0.268', 'HITCOUNT': '0.270', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.981', 'weapon2': '1.194', 'weapon3': '2.270'} [2024-08-05 07:53:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8871936. Throughput: 0: 283.3. Samples: 2218032. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:15,485][00034] Avg episode reward: [(0, '-2.977')] [2024-08-05 07:53:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8880128. Throughput: 0: 283.1. Samples: 2219717. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:20,484][00034] Avg episode reward: [(0, '-2.977')] [2024-08-05 07:53:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8880128. Throughput: 0: 283.6. Samples: 2221429. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:25,485][00034] Avg episode reward: [(0, '-2.977')] [2024-08-05 07:53:26,230][00139] DAMAGECOUNT value on done: 111858.0 [2024-08-05 07:53:26,470][00139] DAMAGECOUNT value on done: 119082.0 [2024-08-05 07:53:26,471][00139] Sum rewards: -5.711, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.792', 'AMMO5': '0.009', 'AMMO2': '0.010', 'AMMO4': '0.047', 'weapon5': '0.050', 'AMMO3': '0.160', 'HITCOUNT': '0.190', 'WEAPON5': '0.200', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.747', 'WEAPON3': '0.900', 'weapon3': '1.880', 'weapon2': '1.888'} [2024-08-05 07:53:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8888320. Throughput: 0: 283.0. Samples: 2222266. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:30,485][00034] Avg episode reward: [(0, '-2.947')] [2024-08-05 07:53:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8896512. Throughput: 0: 283.8. Samples: 2223980. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:35,484][00034] Avg episode reward: [(0, '-2.947')] [2024-08-05 07:53:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001086_8896512.pth... [2024-08-05 07:53:35,578][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001052_8617984.pth [2024-08-05 07:53:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8896512. Throughput: 0: 283.1. Samples: 2225656. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:40,484][00034] Avg episode reward: [(0, '-2.947')] [2024-08-05 07:53:41,102][00139] DAMAGECOUNT value on done: 112149.0 [2024-08-05 07:53:41,102][00139] Sum rewards: 2.277, reward structure: {'DEATHCOUNT': '-3.750', 'HEALTH': '-0.594', 'AMMO2': '0.004', 'AMMO4': '0.020', 'weapon7': '0.052', 'AMMO3': '0.068', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.180', 'WEAPON3': '0.350', 'weapon3': '0.788', 'DAMAGECOUNT': '0.873', 'weapon2': '0.986', 'FRAGCOUNT': '3.000'} [2024-08-05 07:53:41,329][00139] DAMAGECOUNT value on done: 119310.0 [2024-08-05 07:53:41,329][00139] Sum rewards: -0.953, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.092', 'AMMO5': '0.003', 'AMMO2': '0.004', 'AMMO4': '0.018', 'weapon5': '0.054', 'WEAPON5': '0.100', 'AMMO3': '0.137', 'HITCOUNT': '0.160', 'ARMOR': '0.412', 'DAMAGECOUNT': '0.684', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.160', 'weapon3': '2.158'} [2024-08-05 07:53:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8904704. Throughput: 0: 282.8. Samples: 2226503. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:45,484][00034] Avg episode reward: [(0, '-2.909')] [2024-08-05 07:53:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8912896. Throughput: 0: 281.2. Samples: 2228162. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:50,484][00034] Avg episode reward: [(0, '-2.909')] [2024-08-05 07:53:54,548][00139] Large shaping reward -2.519 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.27, -90.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 07:53:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8912896. Throughput: 0: 281.8. Samples: 2229876. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:53:55,486][00034] Avg episode reward: [(0, '-2.909')] [2024-08-05 07:53:55,972][00139] DAMAGECOUNT value on done: 112409.0 [2024-08-05 07:53:55,972][00139] Sum rewards: -5.218, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.980', 'AMMO5': '0.010', 'AMMO2': '0.012', 'AMMO4': '0.059', 'AMMO3': '0.095', 'weapon5': '0.106', 'WEAPON5': '0.150', 'HITCOUNT': '0.260', 'WEAPON3': '0.550', 'DAMAGECOUNT': '0.780', 'FRAGCOUNT': '1.500', 'weapon3': '1.646', 'weapon2': '1.844'} [2024-08-05 07:53:56,206][00139] DAMAGECOUNT value on done: 119480.0 [2024-08-05 07:53:56,206][00139] Sum rewards: -1.787, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.190', 'AMMO5': '0.005', 'AMMO2': '0.006', 'AMMO4': '0.029', 'WEAPON5': '0.050', 'weapon5': '0.104', 'AMMO3': '0.110', 'HITCOUNT': '0.160', 'ARMOR': '0.459', 'DAMAGECOUNT': '0.510', 'WEAPON3': '0.650', 'FRAGCOUNT': '1.000', 'weapon3': '1.606', 'weapon2': '1.964'} [2024-08-05 07:54:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8921088. Throughput: 0: 282.3. Samples: 2230736. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:00,484][00034] Avg episode reward: [(0, '-2.930')] [2024-08-05 07:54:03,737][00138] Updated weights for policy 0, policy_version 1090 (0.0016) [2024-08-05 07:54:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8929280. Throughput: 0: 282.7. Samples: 2232438. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:05,484][00034] Avg episode reward: [(0, '-2.930')] [2024-08-05 07:54:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8929280. Throughput: 0: 281.6. Samples: 2234099. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:10,484][00034] Avg episode reward: [(0, '-2.930')] [2024-08-05 07:54:10,968][00139] DAMAGECOUNT value on done: 112684.0 [2024-08-05 07:54:10,968][00139] Sum rewards: -5.503, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.598', 'AMMO5': '0.005', 'AMMO2': '0.006', 'AMMO4': '0.031', 'weapon5': '0.056', 'WEAPON5': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'ARMOR': '0.112', 'WEAPON4': '0.150', 'AMMO3': '0.173', 'HITCOUNT': '0.240', 'weapon4': '0.250', 'DAMAGECOUNT': '0.825', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.100', 'weapon2': '1.386', 'weapon3': '1.860'} [2024-08-05 07:54:11,258][00139] DAMAGECOUNT value on done: 119636.0 [2024-08-05 07:54:11,259][00139] Sum rewards: -1.749, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.620', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.002', 'AMMO5': '0.005', 'AMMO4': '0.010', 'weapon5': '0.034', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'weapon7': '0.064', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.110', 'HITCOUNT': '0.110', 'weapon4': '0.220', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.468', 'weapon3': '0.790', 'weapon2': '1.308'} [2024-08-05 07:54:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8937472. Throughput: 0: 281.7. Samples: 2234941. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:15,485][00034] Avg episode reward: [(0, '-2.869')] [2024-08-05 07:54:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8945664. Throughput: 0: 280.4. Samples: 2236597. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:20,484][00034] Avg episode reward: [(0, '-2.869')] [2024-08-05 07:54:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8945664. Throughput: 0: 280.4. Samples: 2238273. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:25,484][00034] Avg episode reward: [(0, '-2.869')] [2024-08-05 07:54:26,122][00139] DAMAGECOUNT value on done: 113247.0 [2024-08-05 07:54:26,123][00139] Sum rewards: -0.001, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.605', 'AMMO4': '-0.003', 'AMMO2': '-0.001', 'weapon7': '0.008', 'AMMO5': '0.028', 'weapon5': '0.072', 'AMMO3': '0.139', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'HITCOUNT': '0.400', 'WEAPON5': '0.400', 'ARMOR': '0.524', 'WEAPON3': '1.100', 'weapon2': '1.298', 'DAMAGECOUNT': '1.689', 'weapon3': '2.350', 'FRAGCOUNT': '3.000'} [2024-08-05 07:54:26,374][00139] DAMAGECOUNT value on done: 119982.0 [2024-08-05 07:54:26,374][00139] Sum rewards: 0.582, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.639', 'AMMO2': '0.000', 'AMMO4': '0.000', 'AMMO5': '0.017', 'ARMOR': '0.024', 'AMMO3': '0.123', 'weapon5': '0.256', 'HITCOUNT': '0.330', 'WEAPON5': '0.400', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.038', 'weapon2': '1.532', 'weapon3': '1.950', 'FRAGCOUNT': '3.000'} [2024-08-05 07:54:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8953856. Throughput: 0: 280.5. Samples: 2239126. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:30,484][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:54:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8962048. Throughput: 0: 281.6. Samples: 2240834. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:35,485][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:54:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.6). Total num frames: 8970240. Throughput: 0: 282.0. Samples: 2242564. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:40,485][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:54:40,804][00139] DAMAGECOUNT value on done: 113357.0 [2024-08-05 07:54:41,063][00139] DAMAGECOUNT value on done: 120300.0 [2024-08-05 07:54:41,063][00139] Sum rewards: -4.531, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.546', 'FRAGCOUNT': '-1.000', 'AMMO4': '-0.055', 'AMMO2': '-0.011', 'ARMOR': '0.004', 'AMMO5': '0.014', 'weapon5': '0.040', 'weapon7': '0.046', 'AMMO3': '0.117', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON5': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.210', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.954', 'weapon2': '1.478', 'weapon3': '2.028'} [2024-08-05 07:54:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8970240. Throughput: 0: 281.2. Samples: 2243392. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:45,484][00034] Avg episode reward: [(0, '-2.863')] [2024-08-05 07:54:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8978432. Throughput: 0: 280.8. Samples: 2245072. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:50,486][00034] Avg episode reward: [(0, '-2.863')] [2024-08-05 07:54:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 8986624. Throughput: 0: 281.2. Samples: 2246752. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:54:55,485][00034] Avg episode reward: [(0, '-2.863')] [2024-08-05 07:54:55,776][00139] DAMAGECOUNT value on done: 113604.0 [2024-08-05 07:54:55,777][00139] Sum rewards: -5.479, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.472', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.003', 'AMMO5': '0.008', 'AMMO4': '0.014', 'weapon5': '0.042', 'WEAPON5': '0.100', 'AMMO3': '0.132', 'HITCOUNT': '0.200', 'ARMOR': '0.476', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.741', 'weapon2': '1.490', 'weapon3': '1.638'} [2024-08-05 07:54:56,000][00139] DAMAGECOUNT value on done: 120684.0 [2024-08-05 07:54:56,001][00139] Sum rewards: -1.107, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.630', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'AMMO5': '0.005', 'ARMOR': '0.052', 'weapon5': '0.056', 'WEAPON5': '0.100', 'HITCOUNT': '0.170', 'AMMO3': '0.180', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.152', 'weapon2': '1.254', 'weapon3': '2.360', 'FRAGCOUNT': '3.000'} [2024-08-05 07:55:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 8986624. Throughput: 0: 281.6. Samples: 2247612. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:00,485][00034] Avg episode reward: [(0, '-2.867')] [2024-08-05 07:55:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 8994816. Throughput: 0: 282.2. Samples: 2249296. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:05,484][00034] Avg episode reward: [(0, '-2.867')] [2024-08-05 07:55:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9003008. Throughput: 0: 282.5. Samples: 2250986. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:10,484][00034] Avg episode reward: [(0, '-2.867')] [2024-08-05 07:55:10,637][00139] DAMAGECOUNT value on done: 113819.0 [2024-08-05 07:55:10,637][00139] Sum rewards: -0.871, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.971', 'AMMO4': '-0.041', 'AMMO2': '-0.008', 'AMMO5': '0.007', 'weapon4': '0.040', 'weapon7': '0.050', 'ARMOR': '0.052', 'weapon5': '0.098', 'WEAPON4': '0.100', 'AMMO3': '0.113', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'WEAPON5': '0.200', 'HITCOUNT': '0.200', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.500', 'weapon2': '1.534', 'weapon3': '2.020'} [2024-08-05 07:55:10,857][00139] DAMAGECOUNT value on done: 120893.0 [2024-08-05 07:55:10,858][00139] Sum rewards: -5.519, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.626', 'AMMO2': '0.004', 'AMMO5': '0.010', 'AMMO4': '0.020', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'weapon5': '0.066', 'HITCOUNT': '0.140', 'AMMO3': '0.191', 'weapon4': '0.218', 'ARMOR': '0.480', 'DAMAGECOUNT': '0.627', 'WEAPON3': '1.000', 'weapon3': '1.366', 'FRAGCOUNT': '2.000', 'weapon2': '2.134'} [2024-08-05 07:55:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9003008. Throughput: 0: 283.2. Samples: 2251870. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:15,485][00034] Avg episode reward: [(0, '-2.835')] [2024-08-05 07:55:16,405][00138] Updated weights for policy 0, policy_version 1100 (0.0017) [2024-08-05 07:55:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 9011200. Throughput: 0: 283.1. Samples: 2253574. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:20,484][00034] Avg episode reward: [(0, '-2.835')] [2024-08-05 07:55:25,405][00139] DAMAGECOUNT value on done: 114029.0 [2024-08-05 07:55:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9019392. 
Throughput: 0: 282.0. Samples: 2255256. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:25,484][00034] Avg episode reward: [(0, '-2.884')] [2024-08-05 07:55:25,648][00139] DAMAGECOUNT value on done: 121086.0 [2024-08-05 07:55:25,648][00139] Sum rewards: -3.088, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-2.000', 'AMMO4': '-0.038', 'AMMO2': '-0.008', 'AMMO5': '0.015', 'weapon5': '0.086', 'WEAPON4': '0.100', 'weapon4': '0.146', 'AMMO3': '0.148', 'HITCOUNT': '0.160', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.579', 'WEAPON3': '0.750', 'weapon2': '1.322', 'weapon3': '1.602', 'FRAGCOUNT': '2.000'} [2024-08-05 07:55:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9019392. Throughput: 0: 282.6. Samples: 2256109. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:30,486][00034] Avg episode reward: [(0, '-2.841')] [2024-08-05 07:55:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9027584. Throughput: 0: 282.7. Samples: 2257795. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:35,484][00034] Avg episode reward: [(0, '-2.841')] [2024-08-05 07:55:35,494][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001102_9027584.pth... [2024-08-05 07:55:35,571][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001069_8757248.pth [2024-08-05 07:55:40,294][00139] DAMAGECOUNT value on done: 114295.0 [2024-08-05 07:55:40,295][00139] Sum rewards: -3.090, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.935', 'AMMO4': '-0.047', 'AMMO2': '-0.009', 'AMMO5': '0.013', 'weapon5': '0.072', 'WEAPON4': '0.100', 'AMMO3': '0.106', 'weapon4': '0.112', 'WEAPON5': '0.200', 'HITCOUNT': '0.210', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.798', 'weapon3': '1.390', 'weapon2': '1.550', 'FRAGCOUNT': '2.000'} [2024-08-05 07:55:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). 
Total num frames: 9035776. Throughput: 0: 283.1. Samples: 2259490. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:40,484][00034] Avg episode reward: [(0, '-2.911')] [2024-08-05 07:55:40,543][00139] DAMAGECOUNT value on done: 121460.0 [2024-08-05 07:55:40,544][00139] Sum rewards: -2.556, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.296', 'AMMO4': '-0.005', 'AMMO2': '-0.001', 'weapon5': '0.008', 'AMMO5': '0.020', 'weapon4': '0.030', 'WEAPON4': '0.050', 'ARMOR': '0.104', 'AMMO3': '0.161', 'WEAPON5': '0.250', 'HITCOUNT': '0.360', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.122', 'weapon3': '1.572', 'weapon2': '1.668', 'FRAGCOUNT': '2.000'} [2024-08-05 07:55:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9043968. Throughput: 0: 283.0. Samples: 2260349. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:45,484][00034] Avg episode reward: [(0, '-2.932')] [2024-08-05 07:55:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.6). Total num frames: 9043968. Throughput: 0: 283.3. Samples: 2262046. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:50,486][00034] Avg episode reward: [(0, '-2.932')] [2024-08-05 07:55:55,251][00139] DAMAGECOUNT value on done: 114584.0 [2024-08-05 07:55:55,252][00139] Sum rewards: -2.207, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.240', 'AMMO4': '-0.013', 'AMMO2': '-0.003', 'AMMO5': '0.007', 'weapon5': '0.014', 'ARMOR': '0.035', 'WEAPON5': '0.150', 'weapon7': '0.160', 'HITCOUNT': '0.170', 'AMMO3': '0.183', 'AMMO6': '0.240', 'AMMO7': '0.240', 'WEAPON7': '0.400', 'DAMAGECOUNT': '0.867', 'WEAPON3': '0.900', 'weapon2': '1.336', 'FRAGCOUNT': '2.000', 'weapon3': '2.096'} [2024-08-05 07:55:55,462][00139] DAMAGECOUNT value on done: 121804.0 [2024-08-05 07:55:55,462][00139] Sum rewards: -2.638, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.341', 'AMMO5': '0.003', 'AMMO2': '0.023', 'weapon5': '0.064', 'WEAPON5': '0.100', 'AMMO4': '0.114', 'AMMO3': '0.142', 'HITCOUNT': '0.270', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.032', 'weapon2': '1.500', 'FRAGCOUNT': '2.000', 'weapon3': '2.256'} [2024-08-05 07:55:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9052160. Throughput: 0: 282.5. Samples: 2263700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:55:55,484][00034] Avg episode reward: [(0, '-2.901')] [2024-08-05 07:56:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9060352. Throughput: 0: 282.2. Samples: 2264571. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:00,485][00034] Avg episode reward: [(0, '-2.901')] [2024-08-05 07:56:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9060352. Throughput: 0: 283.2. Samples: 2266320. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:05,484][00034] Avg episode reward: [(0, '-2.901')] [2024-08-05 07:56:09,784][00139] DAMAGECOUNT value on done: 114664.0 [2024-08-05 07:56:10,015][00139] DAMAGECOUNT value on done: 122277.0 [2024-08-05 07:56:10,016][00139] Sum rewards: -1.282, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-0.617', 'AMMO2': '0.012', 'AMMO5': '0.020', 'weapon5': '0.026', 'AMMO4': '0.060', 'ARMOR': '0.080', 'WEAPON5': '0.200', 'AMMO3': '0.212', 'HITCOUNT': '0.370', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.419', 'weapon2': '1.830', 'weapon3': '1.906', 'FRAGCOUNT': '5.000'} [2024-08-05 07:56:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9068544. Throughput: 0: 283.8. Samples: 2268028. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:10,484][00034] Avg episode reward: [(0, '-2.888')] [2024-08-05 07:56:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9076736. Throughput: 0: 283.1. Samples: 2268850. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:15,485][00034] Avg episode reward: [(0, '-2.888')] [2024-08-05 07:56:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9076736. Throughput: 0: 283.5. Samples: 2270553. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:20,484][00034] Avg episode reward: [(0, '-2.888')] [2024-08-05 07:56:24,967][00139] DAMAGECOUNT value on done: 115090.0 [2024-08-05 07:56:24,967][00139] Sum rewards: -4.038, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-3.070', 'AMMO4': '-0.028', 'AMMO2': '-0.005', 'AMMO5': '0.010', 'ARMOR': '0.036', 'weapon5': '0.138', 'AMMO3': '0.151', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.250', 'HITCOUNT': '0.260', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.278', 'weapon3': '1.680', 'weapon2': '2.012', 'FRAGCOUNT': '3.000'} [2024-08-05 07:56:25,268][00139] DAMAGECOUNT value on done: 122460.0 [2024-08-05 07:56:25,269][00139] Sum rewards: -0.364, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.080', 'AMMO4': '-0.049', 'AMMO2': '-0.010', 'ARMOR': '0.040', 'weapon7': '0.102', 'AMMO3': '0.119', 'AMMO6': '0.120', 'AMMO7': '0.120', 'HITCOUNT': '0.180', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.549', 'WEAPON3': '0.550', 'FRAGCOUNT': '1.000', 'weapon3': '1.364', 'weapon2': '1.680'} [2024-08-05 07:56:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9084928. Throughput: 0: 281.7. Samples: 2272167. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:25,485][00034] Avg episode reward: [(0, '-2.857')] [2024-08-05 07:56:29,381][00138] Updated weights for policy 0, policy_version 1110 (0.0017) [2024-08-05 07:56:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9093120. Throughput: 0: 280.3. Samples: 2272961. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:30,484][00034] Avg episode reward: [(0, '-2.857')] [2024-08-05 07:56:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9093120. Throughput: 0: 279.9. Samples: 2274641. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:35,484][00034] Avg episode reward: [(0, '-2.857')] [2024-08-05 07:56:40,087][00139] DAMAGECOUNT value on done: 115400.0 [2024-08-05 07:56:40,087][00139] Sum rewards: 2.509, reward structure: {'DEATHCOUNT': '-6.750', 'AMMO5': '0.005', 'AMMO2': '0.021', 'WEAPON4': '0.050', 'ARMOR': '0.068', 'AMMO3': '0.101', 'AMMO4': '0.104', 'weapon4': '0.210', 'HITCOUNT': '0.280', 'HEALTH': '0.360', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.930', 'weapon2': '1.358', 'weapon3': '2.122', 'FRAGCOUNT': '3.000'} [2024-08-05 07:56:40,321][00139] DAMAGECOUNT value on done: 122504.0 [2024-08-05 07:56:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9101312. Throughput: 0: 281.0. Samples: 2276344. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:40,484][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:56:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9109504. Throughput: 0: 280.3. Samples: 2277184. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:45,484][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:56:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9109504. Throughput: 0: 278.8. Samples: 2278864. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:50,484][00034] Avg episode reward: [(0, '-2.827')] [2024-08-05 07:56:55,102][00139] DAMAGECOUNT value on done: 115645.0 [2024-08-05 07:56:55,103][00139] Sum rewards: -3.407, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.622', 'AMMO5': '0.003', 'AMMO2': '0.009', 'AMMO4': '0.043', 'weapon5': '0.064', 'weapon4': '0.080', 'WEAPON5': '0.100', 'ARMOR': '0.104', 'WEAPON4': '0.150', 'AMMO3': '0.167', 'HITCOUNT': '0.200', 'DAMAGECOUNT': '0.735', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.304', 'weapon3': '2.106'} [2024-08-05 07:56:55,346][00139] DAMAGECOUNT value on done: 123099.0 [2024-08-05 07:56:55,346][00139] Sum rewards: -3.219, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.284', 'AMMO2': '0.008', 'AMMO5': '0.013', 'AMMO4': '0.041', 'weapon5': '0.062', 'ARMOR': '0.088', 'WEAPON4': '0.100', 'AMMO3': '0.145', 'weapon4': '0.246', 'WEAPON5': '0.250', 'HITCOUNT': '0.330', 'WEAPON3': '1.000', 'weapon2': '1.356', 'DAMAGECOUNT': '1.782', 'weapon3': '1.894', 'FRAGCOUNT': '3.000'} [2024-08-05 07:56:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9117696. Throughput: 0: 278.1. Samples: 2280542. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:56:55,484][00034] Avg episode reward: [(0, '-2.885')] [2024-08-05 07:57:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9125888. Throughput: 0: 277.6. Samples: 2281342. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:00,484][00034] Avg episode reward: [(0, '-2.885')] [2024-08-05 07:57:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9125888. Throughput: 0: 276.6. Samples: 2282999. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:05,484][00034] Avg episode reward: [(0, '-2.885')] [2024-08-05 07:57:10,349][00139] DAMAGECOUNT value on done: 115810.0 [2024-08-05 07:57:10,349][00139] Sum rewards: -6.263, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.910', 'AMMO4': '-0.060', 'AMMO2': '-0.012', 'WEAPON4': '0.050', 'weapon4': '0.050', 'ARMOR': '0.064', 'HITCOUNT': '0.120', 'AMMO3': '0.159', 'DAMAGECOUNT': '0.495', 'WEAPON3': '0.850', 'FRAGCOUNT': '1.000', 'weapon3': '1.708', 'weapon2': '1.722'} [2024-08-05 07:57:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9134080. Throughput: 0: 277.9. Samples: 2284673. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:10,484][00034] Avg episode reward: [(0, '-2.875')] [2024-08-05 07:57:10,606][00139] DAMAGECOUNT value on done: 123684.0 [2024-08-05 07:57:10,607][00139] Sum rewards: 3.717, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.510', 'AMMO2': '0.003', 'AMMO5': '0.005', 'AMMO4': '0.014', 'weapon5': '0.104', 'AMMO3': '0.154', 'WEAPON5': '0.200', 'HITCOUNT': '0.420', 'ARMOR': '0.520', 'WEAPON3': '0.900', 'weapon2': '1.216', 'DAMAGECOUNT': '1.755', 'weapon3': '2.436', 'FRAGCOUNT': '5.000'} [2024-08-05 07:57:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9142272. Throughput: 0: 279.6. Samples: 2285545. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:15,484][00034] Avg episode reward: [(0, '-2.829')] [2024-08-05 07:57:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9142272. Throughput: 0: 280.0. Samples: 2287242. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:20,484][00034] Avg episode reward: [(0, '-2.829')] [2024-08-05 07:57:25,190][00139] DAMAGECOUNT value on done: 116040.0 [2024-08-05 07:57:25,191][00139] Sum rewards: -4.851, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.732', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'AMMO5': '0.005', 'WEAPON4': '0.050', 'WEAPON5': '0.050', 'weapon4': '0.054', 'weapon5': '0.060', 'HITCOUNT': '0.110', 'AMMO3': '0.172', 'ARMOR': '0.519', 'DAMAGECOUNT': '0.690', 'WEAPON3': '0.950', 'weapon2': '1.706', 'FRAGCOUNT': '2.000', 'weapon3': '2.050'} [2024-08-05 07:57:25,434][00139] DAMAGECOUNT value on done: 124053.0 [2024-08-05 07:57:25,435][00139] Sum rewards: -0.540, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.757', 'AMMO5': '0.020', 'AMMO2': '0.022', 'weapon5': '0.060', 'AMMO4': '0.109', 'AMMO3': '0.157', 'HITCOUNT': '0.290', 'WEAPON5': '0.300', 'ARMOR': '0.400', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.107', 'weapon2': '1.600', 'weapon3': '2.002', 'FRAGCOUNT': '3.000'} [2024-08-05 07:57:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9150464. Throughput: 0: 279.4. Samples: 2288919. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:25,484][00034] Avg episode reward: [(0, '-2.858')] [2024-08-05 07:57:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9158656. Throughput: 0: 279.2. Samples: 2289747. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:30,485][00034] Avg episode reward: [(0, '-2.858')] [2024-08-05 07:57:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9166848. Throughput: 0: 278.2. Samples: 2291384. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:35,484][00034] Avg episode reward: [(0, '-2.858')] [2024-08-05 07:57:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001119_9166848.pth... 
[2024-08-05 07:57:35,563][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001086_8896512.pth [2024-08-05 07:57:40,439][00139] DAMAGECOUNT value on done: 116460.0 [2024-08-05 07:57:40,439][00139] Sum rewards: -2.720, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.250', 'AMMO4': '-0.004', 'AMMO2': '-0.001', 'AMMO5': '0.009', 'weapon5': '0.024', 'ARMOR': '0.052', 'WEAPON4': '0.100', 'AMMO3': '0.139', 'WEAPON5': '0.200', 'HITCOUNT': '0.330', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'DAMAGECOUNT': '1.260', 'weapon2': '1.438', 'weapon3': '2.182'} [2024-08-05 07:57:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9166848. Throughput: 0: 278.0. Samples: 2293052. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:40,484][00034] Avg episode reward: [(0, '-2.830')] [2024-08-05 07:57:40,690][00139] DAMAGECOUNT value on done: 124553.0 [2024-08-05 07:57:40,691][00139] Sum rewards: -1.970, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.480', 'AMMO5': '0.003', 'AMMO2': '0.006', 'weapon5': '0.008', 'AMMO4': '0.032', 'WEAPON5': '0.050', 'AMMO3': '0.207', 'ARMOR': '0.428', 'HITCOUNT': '0.430', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.500', 'weapon3': '1.602', 'weapon2': '2.244', 'FRAGCOUNT': '4.000'} [2024-08-05 07:57:42,789][00138] Updated weights for policy 0, policy_version 1120 (0.0017) [2024-08-05 07:57:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9175040. Throughput: 0: 278.4. Samples: 2293871. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:45,485][00034] Avg episode reward: [(0, '-2.741')] [2024-08-05 07:57:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.6). Total num frames: 9183232. Throughput: 0: 279.5. Samples: 2295575. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:50,484][00034] Avg episode reward: [(0, '-2.741')] [2024-08-05 07:57:55,376][00139] DAMAGECOUNT value on done: 116769.0 [2024-08-05 07:57:55,377][00139] Sum rewards: 1.743, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.170', 'AMMO2': '0.013', 'AMMO5': '0.014', 'AMMO4': '0.064', 'weapon5': '0.122', 'AMMO3': '0.141', 'HITCOUNT': '0.250', 'WEAPON5': '0.250', 'ARMOR': '0.463', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.927', 'weapon2': '1.460', 'weapon3': '2.208', 'FRAGCOUNT': '3.000'} [2024-08-05 07:57:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9183232. Throughput: 0: 279.9. Samples: 2297268. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:57:55,485][00034] Avg episode reward: [(0, '-2.746')] [2024-08-05 07:57:55,604][00139] DAMAGECOUNT value on done: 124986.0 [2024-08-05 07:57:55,605][00139] Sum rewards: 0.876, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.208', 'AMMO4': '-0.009', 'AMMO2': '-0.002', 'AMMO5': '0.005', 'weapon5': '0.056', 'WEAPON5': '0.100', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'AMMO3': '0.124', 'HITCOUNT': '0.150', 'ARMOR': '0.510', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.299', 'weapon2': '1.588', 'weapon3': '1.912', 'FRAGCOUNT': '3.500'} [2024-08-05 07:58:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9191424. Throughput: 0: 279.0. Samples: 2298099. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:00,486][00034] Avg episode reward: [(0, '-2.675')] [2024-08-05 07:58:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9199616. Throughput: 0: 278.4. Samples: 2299771. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:05,485][00034] Avg episode reward: [(0, '-2.675')] [2024-08-05 07:58:10,377][00139] DAMAGECOUNT value on done: 117146.0 [2024-08-05 07:58:10,377][00139] Sum rewards: 1.565, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.684', 'AMMO5': '0.005', 'AMMO2': '0.006', 'AMMO4': '0.028', 'ARMOR': '0.028', 'weapon5': '0.048', 'weapon7': '0.058', 'AMMO3': '0.096', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON5': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.240', 'WEAPON3': '0.600', 'weapon3': '1.094', 'DAMAGECOUNT': '1.131', 'weapon2': '1.766', 'FRAGCOUNT': '2.000'} [2024-08-05 07:58:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9199616. Throughput: 0: 278.9. Samples: 2301468. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:10,484][00034] Avg episode reward: [(0, '-2.641')] [2024-08-05 07:58:10,610][00139] DAMAGECOUNT value on done: 125536.0 [2024-08-05 07:58:10,611][00139] Sum rewards: -3.964, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.210', 'AMMO2': '0.005', 'AMMO5': '0.013', 'WEAPON1': '0.020', 'AMMO4': '0.024', 'AMMO3': '0.146', 'weapon5': '0.162', 'HITCOUNT': '0.350', 'WEAPON5': '0.350', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.650', 'weapon3': '1.658', 'weapon2': '1.918', 'FRAGCOUNT': '3.000'} [2024-08-05 07:58:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9207808. Throughput: 0: 279.2. Samples: 2302310. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:15,485][00034] Avg episode reward: [(0, '-2.654')] [2024-08-05 07:58:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9216000. Throughput: 0: 280.9. Samples: 2304026. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:20,484][00034] Avg episode reward: [(0, '-2.654')] [2024-08-05 07:58:24,939][00139] DAMAGECOUNT value on done: 117501.0 [2024-08-05 07:58:24,940][00139] Sum rewards: -6.821, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.458', 'AMMO4': '-0.024', 'AMMO2': '-0.005', 'AMMO5': '0.007', 'weapon5': '0.056', 'ARMOR': '0.088', 'WEAPON5': '0.150', 'AMMO3': '0.174', 'HITCOUNT': '0.310', 'FRAGCOUNT': '0.500', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.065', 'weapon2': '1.700', 'weapon3': '1.866'} [2024-08-05 07:58:25,151][00139] DAMAGECOUNT value on done: 125891.0 [2024-08-05 07:58:25,152][00139] Sum rewards: 2.107, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.907', 'weapon5': '0.006', 'AMMO2': '0.018', 'AMMO5': '0.020', 'ARMOR': '0.028', 'AMMO4': '0.091', 'WEAPON4': '0.100', 'AMMO3': '0.140', 'WEAPON5': '0.250', 'HITCOUNT': '0.320', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.065', 'weapon2': '1.728', 'weapon3': '1.998', 'FRAGCOUNT': '4.000'} [2024-08-05 07:58:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9216000. Throughput: 0: 283.1. Samples: 2305793. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:25,484][00034] Avg episode reward: [(0, '-2.639')] [2024-08-05 07:58:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9224192. Throughput: 0: 283.4. Samples: 2306625. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:30,485][00034] Avg episode reward: [(0, '-2.639')] [2024-08-05 07:58:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9232384. Throughput: 0: 281.9. Samples: 2308262. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:35,484][00034] Avg episode reward: [(0, '-2.639')] [2024-08-05 07:58:39,788][00139] DAMAGECOUNT value on done: 118386.0 [2024-08-05 07:58:39,788][00139] Sum rewards: 3.122, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-0.080', 'AMMO2': '0.012', 'AMMO5': '0.017', 'WEAPON4': '0.050', 'AMMO4': '0.062', 'ARMOR': '0.064', 'weapon5': '0.118', 'AMMO3': '0.129', 'WEAPON5': '0.300', 'HITCOUNT': '0.470', 'WEAPON3': '0.700', 'weapon2': '1.580', 'weapon3': '2.044', 'DAMAGECOUNT': '2.655', 'FRAGCOUNT': '7.000'} [2024-08-05 07:58:40,018][00139] DAMAGECOUNT value on done: 126001.0 [2024-08-05 07:58:40,018][00139] Sum rewards: -4.766, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.787', 'AMMO2': '0.002', 'AMMO4': '0.008', 'WEAPON1': '0.010', 'AMMO5': '0.015', 'weapon5': '0.078', 'HITCOUNT': '0.100', 'AMMO3': '0.137', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.330', 'ARMOR': '0.483', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon3': '1.738', 'weapon2': '1.820'} [2024-08-05 07:58:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9232384. Throughput: 0: 283.6. Samples: 2310028. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:40,484][00034] Avg episode reward: [(0, '-2.630')] [2024-08-05 07:58:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9240576. Throughput: 0: 283.7. Samples: 2310865. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:45,484][00034] Avg episode reward: [(0, '-2.630')] [2024-08-05 07:58:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9248768. Throughput: 0: 284.1. Samples: 2312554. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:50,486][00034] Avg episode reward: [(0, '-2.630')] [2024-08-05 07:58:54,585][00139] DAMAGECOUNT value on done: 118492.0 [2024-08-05 07:58:54,793][00139] DAMAGECOUNT value on done: 126451.0 [2024-08-05 07:58:54,793][00139] Sum rewards: -2.908, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.856', 'AMMO2': '0.005', 'AMMO4': '0.026', 'weapon4': '0.076', 'WEAPON4': '0.100', 'AMMO3': '0.210', 'HITCOUNT': '0.380', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.350', 'weapon3': '1.648', 'weapon2': '1.652', 'FRAGCOUNT': '3.000'} [2024-08-05 07:58:55,032][00138] Updated weights for policy 0, policy_version 1130 (0.0018) [2024-08-05 07:58:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9256960. Throughput: 0: 284.6. Samples: 2314276. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:58:55,484][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 07:59:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9256960. Throughput: 0: 285.4. Samples: 2315154. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:00,484][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 07:59:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9265152. Throughput: 0: 283.7. Samples: 2316792. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:05,485][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 07:59:09,549][00139] DAMAGECOUNT value on done: 118746.0 [2024-08-05 07:59:09,549][00139] Sum rewards: -2.440, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.235', 'AMMO2': '0.018', 'ARMOR': '0.056', 'AMMO4': '0.088', 'AMMO3': '0.151', 'WEAPON4': '0.200', 'HITCOUNT': '0.230', 'weapon4': '0.420', 'WEAPON3': '0.750', 'DAMAGECOUNT': '0.762', 'weapon3': '1.584', 'weapon2': '1.786', 'FRAGCOUNT': '4.000'} [2024-08-05 07:59:09,850][00139] DAMAGECOUNT value on done: 126732.0 [2024-08-05 07:59:09,850][00139] Sum rewards: -5.591, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.626', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.007', 'AMMO2': '0.017', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'weapon4': '0.050', 'AMMO4': '0.086', 'weapon5': '0.102', 'AMMO3': '0.117', 'HITCOUNT': '0.140', 'ARMOR': '0.436', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.843', 'weapon3': '1.186', 'weapon2': '2.350'} [2024-08-05 07:59:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9273344. Throughput: 0: 281.8. Samples: 2318473. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:10,485][00034] Avg episode reward: [(0, '-2.591')] [2024-08-05 07:59:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9273344. Throughput: 0: 282.7. Samples: 2319346. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:15,485][00034] Avg episode reward: [(0, '-2.591')] [2024-08-05 07:59:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9281536. Throughput: 0: 284.4. Samples: 2321060. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:20,484][00034] Avg episode reward: [(0, '-2.591')] [2024-08-05 07:59:24,534][00139] DAMAGECOUNT value on done: 119154.0 [2024-08-05 07:59:24,534][00139] Sum rewards: -0.726, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.694', 'AMMO2': '0.005', 'AMMO5': '0.015', 'AMMO4': '0.022', 'weapon7': '0.080', 'ARMOR': '0.092', 'WEAPON4': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.128', 'weapon5': '0.172', 'WEAPON7': '0.200', 'weapon4': '0.226', 'HITCOUNT': '0.230', 'WEAPON5': '0.350', 'WEAPON3': '0.500', 'weapon3': '1.012', 'DAMAGECOUNT': '1.224', 'weapon2': '1.372', 'FRAGCOUNT': '3.000'} [2024-08-05 07:59:24,787][00139] DAMAGECOUNT value on done: 127275.0 [2024-08-05 07:59:24,787][00139] Sum rewards: -0.443, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.302', 'AMMO5': '0.006', 'AMMO2': '0.013', 'weapon4': '0.020', 'ARMOR': '0.040', 'WEAPON1': '0.040', 'AMMO4': '0.064', 'WEAPON4': '0.100', 'weapon5': '0.104', 'AMMO3': '0.150', 'WEAPON5': '0.200', 'HITCOUNT': '0.410', 'WEAPON3': '0.850', 'weapon2': '1.540', 'DAMAGECOUNT': '1.629', 'weapon3': '1.942', 'FRAGCOUNT': '5.000'} [2024-08-05 07:59:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9289728. Throughput: 0: 281.4. Samples: 2322691. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:25,484][00034] Avg episode reward: [(0, '-2.564')] [2024-08-05 07:59:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9289728. Throughput: 0: 281.8. Samples: 2323547. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:30,485][00034] Avg episode reward: [(0, '-2.564')] [2024-08-05 07:59:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9297920. Throughput: 0: 282.4. Samples: 2325260. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:35,485][00034] Avg episode reward: [(0, '-2.564')] [2024-08-05 07:59:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001135_9297920.pth... [2024-08-05 07:59:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001102_9027584.pth [2024-08-05 07:59:39,521][00139] DAMAGECOUNT value on done: 119509.0 [2024-08-05 07:59:39,521][00139] Sum rewards: 1.652, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.370', 'AMMO5': '0.003', 'AMMO2': '0.010', 'WEAPON5': '0.050', 'AMMO4': '0.051', 'weapon4': '0.064', 'weapon7': '0.070', 'ARMOR': '0.072', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON4': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.127', 'HITCOUNT': '0.200', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.065', 'weapon2': '1.226', 'weapon3': '1.584', 'FRAGCOUNT': '4.000'} [2024-08-05 07:59:39,753][00139] DAMAGECOUNT value on done: 127659.0 [2024-08-05 07:59:39,754][00139] Sum rewards: 3.904, reward structure: {'DEATHCOUNT': '-4.500', 'HEALTH': '-0.511', 'weapon4': '0.002', 'AMMO5': '0.005', 'AMMO2': '0.006', 'AMMO4': '0.031', 'ARMOR': '0.040', 'weapon5': '0.054', 'weapon7': '0.070', 'AMMO3': '0.094', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON7': '0.200', 'HITCOUNT': '0.290', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.152', 'weapon3': '1.236', 'weapon2': '1.794', 'FRAGCOUNT': '3.000'} [2024-08-05 07:59:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9306112. Throughput: 0: 280.7. Samples: 2326908. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:40,484][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 07:59:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9306112. Throughput: 0: 280.0. Samples: 2327755. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:45,484][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 07:59:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9314304. Throughput: 0: 280.6. Samples: 2329419. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:50,484][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 07:59:54,327][00139] DAMAGECOUNT value on done: 119653.0 [2024-08-05 07:59:54,327][00139] Sum rewards: 3.661, reward structure: {'DEATHCOUNT': '-4.500', 'AMMO2': '0.014', 'ARMOR': '0.032', 'weapon4': '0.034', 'WEAPON4': '0.050', 'AMMO4': '0.067', 'AMMO3': '0.090', 'HITCOUNT': '0.140', 'WEAPON3': '0.350', 'DAMAGECOUNT': '0.432', 'HEALTH': '0.774', 'weapon3': '1.396', 'weapon2': '1.782', 'FRAGCOUNT': '3.000'} [2024-08-05 07:59:54,545][00139] DAMAGECOUNT value on done: 128001.0 [2024-08-05 07:59:54,546][00139] Sum rewards: 0.502, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.782', 'AMMO2': '0.000', 'AMMO4': '0.001', 'AMMO5': '0.003', 'weapon5': '0.044', 'WEAPON5': '0.050', 'WEAPON4': '0.050', 'AMMO3': '0.104', 'weapon4': '0.142', 'ARMOR': '0.148', 'HITCOUNT': '0.250', 'WEAPON3': '0.650', 'DAMAGECOUNT': '1.026', 'weapon2': '1.698', 'weapon3': '1.868', 'FRAGCOUNT': '2.000'} [2024-08-05 07:59:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9322496. Throughput: 0: 282.2. Samples: 2331173. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 07:59:55,484][00034] Avg episode reward: [(0, '-2.412')] [2024-08-05 08:00:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9330688. Throughput: 0: 282.8. Samples: 2332071. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:00,484][00034] Avg episode reward: [(0, '-2.412')] [2024-08-05 08:00:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9330688. Throughput: 0: 281.9. Samples: 2333745. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:05,484][00034] Avg episode reward: [(0, '-2.412')] [2024-08-05 08:00:07,842][00138] Updated weights for policy 0, policy_version 1140 (0.0018) [2024-08-05 08:00:09,340][00139] DAMAGECOUNT value on done: 119768.0 [2024-08-05 08:00:09,341][00139] Sum rewards: -4.252, reward structure: {'DEATHCOUNT': '-7.500', 'FRAGCOUNT': '-1.500', 'AMMO5': '0.005', 'AMMO2': '0.007', 'weapon5': '0.010', 'AMMO4': '0.036', 'ARMOR': '0.040', 'HITCOUNT': '0.070', 'AMMO3': '0.090', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'weapon7': '0.100', 'HEALTH': '0.219', 'DAMAGECOUNT': '0.345', 'WEAPON3': '0.450', 'weapon3': '1.326', 'weapon2': '1.650'} [2024-08-05 08:00:09,576][00139] DAMAGECOUNT value on done: 128162.0 [2024-08-05 08:00:09,576][00139] Sum rewards: -2.729, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.612', 'weapon5': '0.008', 'AMMO2': '0.009', 'AMMO5': '0.010', 'AMMO4': '0.044', 'WEAPON5': '0.100', 'AMMO3': '0.123', 'HITCOUNT': '0.150', 'ARMOR': '0.428', 'DAMAGECOUNT': '0.483', 'WEAPON3': '0.800', 'FRAGCOUNT': '1.000', 'weapon2': '1.160', 'weapon3': '2.068'} [2024-08-05 08:00:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9338880. Throughput: 0: 281.2. Samples: 2335346. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:10,485][00034] Avg episode reward: [(0, '-2.408')] [2024-08-05 08:00:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9347072. Throughput: 0: 281.3. Samples: 2336204. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:15,484][00034] Avg episode reward: [(0, '-2.408')] [2024-08-05 08:00:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9347072. Throughput: 0: 281.4. Samples: 2337923. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:20,484][00034] Avg episode reward: [(0, '-2.408')] [2024-08-05 08:00:24,163][00139] DAMAGECOUNT value on done: 120133.0 [2024-08-05 08:00:24,164][00139] Sum rewards: -2.555, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.655', 'AMMO2': '0.008', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'ARMOR': '0.028', 'AMMO4': '0.038', 'weapon5': '0.070', 'AMMO3': '0.123', 'HITCOUNT': '0.240', 'WEAPON5': '0.250', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.095', 'weapon2': '1.326', 'weapon3': '1.842', 'FRAGCOUNT': '3.000'} [2024-08-05 08:00:24,385][00139] DAMAGECOUNT value on done: 128243.0 [2024-08-05 08:00:24,386][00139] Sum rewards: -7.900, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-1.744', 'AMMO2': '0.003', 'AMMO5': '0.007', 'AMMO4': '0.013', 'ARMOR': '0.048', 'HITCOUNT': '0.050', 'WEAPON5': '0.150', 'weapon5': '0.158', 'AMMO3': '0.193', 'DAMAGECOUNT': '0.243', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.800', 'weapon3': '1.774', 'weapon2': '1.904'} [2024-08-05 08:00:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9355264. Throughput: 0: 282.6. Samples: 2339626. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:25,484][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:00:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9363456. Throughput: 0: 282.9. Samples: 2340485. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:30,484][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:00:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9363456. Throughput: 0: 285.0. Samples: 2342244. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:35,485][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:00:38,681][00139] DAMAGECOUNT value on done: 120705.0 [2024-08-05 08:00:38,682][00139] Sum rewards: 2.784, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.773', 'AMMO5': '0.005', 'AMMO2': '0.023', 'ARMOR': '0.036', 'weapon7': '0.094', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.105', 'AMMO4': '0.116', 'WEAPON4': '0.200', 'HITCOUNT': '0.330', 'weapon4': '0.456', 'WEAPON3': '0.650', 'weapon2': '1.176', 'DAMAGECOUNT': '1.716', 'weapon3': '1.750', 'FRAGCOUNT': '4.000'} [2024-08-05 08:00:38,979][00139] DAMAGECOUNT value on done: 128533.0 [2024-08-05 08:00:38,980][00139] Sum rewards: -1.198, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.506', 'AMMO2': '0.002', 'AMMO5': '0.005', 'AMMO4': '0.009', 'weapon5': '0.026', 'ARMOR': '0.095', 'WEAPON5': '0.100', 'AMMO3': '0.149', 'HITCOUNT': '0.230', 'WEAPON3': '0.850', 'DAMAGECOUNT': '0.870', 'weapon3': '1.726', 'weapon2': '1.746', 'FRAGCOUNT': '2.000'} [2024-08-05 08:00:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9371648. Throughput: 0: 283.4. Samples: 2343926. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:40,484][00034] Avg episode reward: [(0, '-2.331')] [2024-08-05 08:00:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9379840. Throughput: 0: 282.5. Samples: 2344784. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:45,484][00034] Avg episode reward: [(0, '-2.331')] [2024-08-05 08:00:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9379840. Throughput: 0: 283.7. Samples: 2346512. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:50,484][00034] Avg episode reward: [(0, '-2.331')] [2024-08-05 08:00:53,457][00139] DAMAGECOUNT value on done: 120775.0 [2024-08-05 08:00:53,693][00139] DAMAGECOUNT value on done: 129006.0 [2024-08-05 08:00:53,694][00139] Sum rewards: 1.092, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.885', 'AMMO5': '0.007', 'AMMO2': '0.014', 'weapon5': '0.050', 'AMMO4': '0.071', 'weapon4': '0.094', 'AMMO3': '0.149', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'HITCOUNT': '0.260', 'WEAPON3': '0.850', 'weapon2': '1.356', 'DAMAGECOUNT': '1.419', 'weapon3': '2.156', 'FRAGCOUNT': '5.000'} [2024-08-05 08:00:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9388032. Throughput: 0: 286.3. Samples: 2348230. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:00:55,485][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 08:01:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9396224. Throughput: 0: 286.5. Samples: 2349096. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:00,485][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 08:01:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9404416. Throughput: 0: 287.4. Samples: 2350857. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:05,484][00034] Avg episode reward: [(0, '-2.392')] [2024-08-05 08:01:07,819][00139] DAMAGECOUNT value on done: 121014.0 [2024-08-05 08:01:08,029][00139] DAMAGECOUNT value on done: 129335.0 [2024-08-05 08:01:08,030][00139] Sum rewards: -2.325, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.692', 'AMMO2': '0.012', 'AMMO5': '0.015', 'AMMO4': '0.058', 'weapon5': '0.110', 'AMMO3': '0.129', 'WEAPON5': '0.150', 'HITCOUNT': '0.290', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.987', 'FRAGCOUNT': '1.000', 'weapon3': '1.518', 'weapon2': '1.648'} [2024-08-05 08:01:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9404416. Throughput: 0: 287.2. Samples: 2352550. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:10,485][00034] Avg episode reward: [(0, '-2.365')] [2024-08-05 08:01:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9412608. Throughput: 0: 286.6. Samples: 2353382. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:15,484][00034] Avg episode reward: [(0, '-2.365')] [2024-08-05 08:01:19,594][00138] Updated weights for policy 0, policy_version 1150 (0.0018) [2024-08-05 08:01:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9420800. Throughput: 0: 285.7. Samples: 2355100. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:20,486][00034] Avg episode reward: [(0, '-2.365')] [2024-08-05 08:01:22,878][00139] DAMAGECOUNT value on done: 121375.0 [2024-08-05 08:01:22,879][00139] Sum rewards: -2.735, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.582', 'AMMO2': '0.004', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO4': '0.021', 'ARMOR': '0.040', 'AMMO3': '0.179', 'WEAPON5': '0.200', 'HITCOUNT': '0.330', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.083', 'weapon2': '1.558', 'FRAGCOUNT': '2.000', 'weapon3': '2.116'} [2024-08-05 08:01:23,128][00139] DAMAGECOUNT value on done: 129707.0 [2024-08-05 08:01:23,128][00139] Sum rewards: -4.132, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.912', 'AMMO2': '0.016', 'AMMO5': '0.020', 'weapon4': '0.056', 'AMMO4': '0.079', 'WEAPON4': '0.100', 'weapon5': '0.110', 'AMMO3': '0.125', 'HITCOUNT': '0.140', 'WEAPON5': '0.300', 'FRAGCOUNT': '0.500', 'ARMOR': '0.546', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.116', 'weapon2': '1.556', 'weapon3': '1.966'} [2024-08-05 08:01:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9420800. Throughput: 0: 285.3. Samples: 2356766. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:25,485][00034] Avg episode reward: [(0, '-2.465')] [2024-08-05 08:01:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9428992. Throughput: 0: 284.9. Samples: 2357603. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:30,485][00034] Avg episode reward: [(0, '-2.465')] [2024-08-05 08:01:35,485][00034] Fps is (10 sec: 1638.1, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9437184. Throughput: 0: 284.3. Samples: 2359306. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:35,489][00034] Avg episode reward: [(0, '-2.465')] [2024-08-05 08:01:35,498][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001152_9437184.pth... 
[2024-08-05 08:01:35,573][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001119_9166848.pth [2024-08-05 08:01:37,805][00139] DAMAGECOUNT value on done: 121750.0 [2024-08-05 08:01:37,805][00139] Sum rewards: 0.226, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.874', 'AMMO2': '0.001', 'AMMO4': '0.004', 'AMMO5': '0.009', 'weapon5': '0.142', 'AMMO3': '0.153', 'WEAPON5': '0.200', 'HITCOUNT': '0.270', 'WEAPON3': '0.950', 'weapon2': '0.950', 'ARMOR': '0.968', 'DAMAGECOUNT': '1.125', 'weapon3': '2.328', 'FRAGCOUNT': '3.000'} [2024-08-05 08:01:38,044][00139] DAMAGECOUNT value on done: 129962.0 [2024-08-05 08:01:38,044][00139] Sum rewards: -6.245, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-2.002', 'AMMO2': '0.004', 'AMMO5': '0.005', 'AMMO4': '0.021', 'ARMOR': '0.048', 'weapon5': '0.094', 'WEAPON5': '0.150', 'AMMO3': '0.178', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.765', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.050', 'weapon2': '1.664', 'weapon3': '1.818'} [2024-08-05 08:01:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9437184. Throughput: 0: 283.0. Samples: 2360966. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:40,485][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 08:01:45,483][00034] Fps is (10 sec: 819.3, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9445376. Throughput: 0: 278.8. Samples: 2361643. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:45,486][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 08:01:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9453568. Throughput: 0: 276.8. Samples: 2363311. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:50,485][00034] Avg episode reward: [(0, '-2.448')] [2024-08-05 08:01:53,681][00139] DAMAGECOUNT value on done: 121851.0 [2024-08-05 08:01:53,681][00139] Sum rewards: -3.857, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.792', 'AMMO2': '0.005', 'weapon5': '0.006', 'AMMO5': '0.010', 'AMMO4': '0.025', 'ARMOR': '0.060', 'HITCOUNT': '0.130', 'WEAPON5': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.150', 'AMMO3': '0.189', 'DAMAGECOUNT': '0.303', 'WEAPON3': '0.950', 'weapon2': '1.036', 'FRAGCOUNT': '2.000', 'weapon3': '2.520'} [2024-08-05 08:01:53,923][00139] DAMAGECOUNT value on done: 130391.0 [2024-08-05 08:01:53,924][00139] Sum rewards: -6.657, reward structure: {'DEATHCOUNT': '-11.250', 'FRAGCOUNT': '-1.500', 'HEALTH': '-0.520', 'AMMO5': '0.017', 'AMMO2': '0.019', 'WEAPON1': '0.020', 'ARMOR': '0.065', 'AMMO4': '0.093', 'AMMO3': '0.094', 'weapon4': '0.098', 'WEAPON4': '0.150', 'HITCOUNT': '0.170', 'weapon5': '0.278', 'WEAPON5': '0.400', 'WEAPON3': '0.500', 'DAMAGECOUNT': '1.287', 'weapon3': '1.482', 'weapon2': '1.940'} [2024-08-05 08:01:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9453568. Throughput: 0: 275.2. Samples: 2364936. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:01:55,484][00034] Avg episode reward: [(0, '-2.513')] [2024-08-05 08:02:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9461760. Throughput: 0: 275.8. Samples: 2365795. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:00,484][00034] Avg episode reward: [(0, '-2.513')] [2024-08-05 08:02:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9469952. Throughput: 0: 275.1. Samples: 2367478. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:05,484][00034] Avg episode reward: [(0, '-2.513')] [2024-08-05 08:02:08,661][00139] DAMAGECOUNT value on done: 122438.0 [2024-08-05 08:02:08,662][00139] Sum rewards: -6.335, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.902', 'AMMO4': '-0.025', 'AMMO2': '-0.005', 'weapon5': '0.002', 'weapon7': '0.002', 'AMMO5': '0.018', 'WEAPON1': '0.020', 'ARMOR': '0.064', 'AMMO6': '0.100', 'WEAPON7': '0.100', 'AMMO7': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.242', 'HITCOUNT': '0.540', 'WEAPON3': '1.150', 'weapon2': '1.666', 'DAMAGECOUNT': '1.761', 'weapon3': '1.932', 'FRAGCOUNT': '3.000'} [2024-08-05 08:02:08,957][00139] DAMAGECOUNT value on done: 130966.0 [2024-08-05 08:02:08,957][00139] Sum rewards: -0.191, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.154', 'weapon4': '0.004', 'AMMO5': '0.007', 'ARMOR': '0.010', 'AMMO2': '0.020', 'weapon5': '0.086', 'AMMO4': '0.099', 'WEAPON4': '0.100', 'AMMO3': '0.141', 'WEAPON5': '0.150', 'HITCOUNT': '0.400', 'WEAPON3': '0.850', 'weapon3': '1.690', 'DAMAGECOUNT': '1.725', 'weapon2': '1.930', 'FRAGCOUNT': '5.000'} [2024-08-05 08:02:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9469952. Throughput: 0: 274.7. Samples: 2369129. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:10,484][00034] Avg episode reward: [(0, '-2.523')] [2024-08-05 08:02:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9478144. Throughput: 0: 274.2. Samples: 2369943. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:15,484][00034] Avg episode reward: [(0, '-2.523')] [2024-08-05 08:02:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9486336. Throughput: 0: 273.3. Samples: 2371603. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:20,484][00034] Avg episode reward: [(0, '-2.523')] [2024-08-05 08:02:23,786][00139] DAMAGECOUNT value on done: 122599.0 [2024-08-05 08:02:24,018][00139] DAMAGECOUNT value on done: 131215.0 [2024-08-05 08:02:24,019][00139] Sum rewards: -5.162, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.810', 'AMMO4': '-0.015', 'AMMO2': '-0.003', 'AMMO5': '0.017', 'ARMOR': '0.036', 'weapon5': '0.066', 'AMMO3': '0.143', 'HITCOUNT': '0.240', 'WEAPON5': '0.300', 'DAMAGECOUNT': '0.747', 'WEAPON3': '1.000', 'FRAGCOUNT': '1.000', 'weapon2': '1.534', 'weapon3': '2.082'} [2024-08-05 08:02:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9486336. Throughput: 0: 274.4. Samples: 2373315. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:25,484][00034] Avg episode reward: [(0, '-2.496')] [2024-08-05 08:02:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9494528. Throughput: 0: 277.6. Samples: 2374133. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:30,484][00034] Avg episode reward: [(0, '-2.496')] [2024-08-05 08:02:33,634][00138] Updated weights for policy 0, policy_version 1160 (0.0020) [2024-08-05 08:02:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9502720. Throughput: 0: 278.1. Samples: 2375827. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:35,484][00034] Avg episode reward: [(0, '-2.496')] [2024-08-05 08:02:38,843][00139] DAMAGECOUNT value on done: 122970.0 [2024-08-05 08:02:38,844][00139] Sum rewards: -2.949, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.760', 'AMMO2': '0.001', 'AMMO4': '0.006', 'weapon5': '0.012', 'AMMO5': '0.020', 'ARMOR': '0.044', 'WEAPON4': '0.050', 'AMMO3': '0.146', 'weapon4': '0.170', 'WEAPON5': '0.250', 'HITCOUNT': '0.280', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.850', 'DAMAGECOUNT': '1.113', 'weapon2': '1.376', 'weapon3': '1.992'} [2024-08-05 08:02:39,075][00139] DAMAGECOUNT value on done: 131390.0 [2024-08-05 08:02:39,076][00139] Sum rewards: 0.765, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-0.980', 'AMMO4': '-0.034', 'AMMO2': '-0.007', 'AMMO5': '0.005', 'weapon5': '0.102', 'AMMO3': '0.103', 'WEAPON5': '0.150', 'HITCOUNT': '0.160', 'ARMOR': '0.492', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.525', 'weapon2': '1.008', 'weapon3': '1.990', 'FRAGCOUNT': '2.000'} [2024-08-05 08:02:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9502720. Throughput: 0: 279.0. Samples: 2377490. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:40,484][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:02:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9510912. Throughput: 0: 277.5. Samples: 2378284. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:45,485][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:02:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9519104. Throughput: 0: 276.2. Samples: 2379905. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:50,485][00034] Avg episode reward: [(0, '-2.453')] [2024-08-05 08:02:54,092][00139] DAMAGECOUNT value on done: 123265.0 [2024-08-05 08:02:54,093][00139] Sum rewards: -4.657, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-2.136', 'AMMO4': '-0.052', 'AMMO2': '-0.010', 'weapon4': '0.002', 'AMMO5': '0.019', 'WEAPON1': '0.020', 'weapon5': '0.032', 'WEAPON4': '0.050', 'ARMOR': '0.082', 'AMMO3': '0.120', 'WEAPON5': '0.200', 'HITCOUNT': '0.220', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.800', 'DAMAGECOUNT': '0.885', 'weapon3': '1.606', 'weapon2': '2.006'} [2024-08-05 08:02:54,340][00139] DAMAGECOUNT value on done: 131540.0 [2024-08-05 08:02:54,340][00139] Sum rewards: -3.024, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.988', 'AMMO4': '-0.038', 'AMMO2': '-0.008', 'weapon4': '0.012', 'AMMO5': '0.012', 'WEAPON4': '0.100', 'HITCOUNT': '0.110', 'WEAPON5': '0.150', 'AMMO3': '0.190', 'DAMAGECOUNT': '0.450', 'ARMOR': '0.480', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.482', 'weapon3': '1.622'} [2024-08-05 08:02:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9519104. Throughput: 0: 277.5. Samples: 2381615. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:02:55,484][00034] Avg episode reward: [(0, '-2.561')] [2024-08-05 08:03:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9527296. Throughput: 0: 278.1. Samples: 2382457. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:00,484][00034] Avg episode reward: [(0, '-2.561')] [2024-08-05 08:03:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9535488. Throughput: 0: 278.9. Samples: 2384155. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:05,485][00034] Avg episode reward: [(0, '-2.561')] [2024-08-05 08:03:08,993][00139] DAMAGECOUNT value on done: 123558.0 [2024-08-05 08:03:08,993][00139] Sum rewards: 0.173, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.610', 'AMMO2': '0.003', 'AMMO5': '0.006', 'AMMO4': '0.017', 'weapon7': '0.086', 'AMMO3': '0.088', 'weapon5': '0.100', 'AMMO6': '0.120', 'AMMO7': '0.120', 'WEAPON5': '0.150', 'HITCOUNT': '0.150', 'WEAPON7': '0.200', 'WEAPON3': '0.500', 'FRAGCOUNT': '0.500', 'ARMOR': '0.511', 'DAMAGECOUNT': '0.879', 'weapon2': '1.642', 'weapon3': '1.710'} [2024-08-05 08:03:09,226][00139] DAMAGECOUNT value on done: 131812.0 [2024-08-05 08:03:09,226][00139] Sum rewards: 0.542, reward structure: {'DEATHCOUNT': '-6.000', 'AMMO5': '0.003', 'AMMO2': '0.009', 'WEAPON1': '0.010', 'AMMO4': '0.046', 'WEAPON5': '0.050', 'weapon5': '0.052', 'weapon7': '0.056', 'HEALTH': '0.070', 'AMMO3': '0.082', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'HITCOUNT': '0.200', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.816', 'weapon3': '1.166', 'weapon2': '1.182', 'FRAGCOUNT': '2.000'} [2024-08-05 08:03:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9535488. Throughput: 0: 278.5. Samples: 2385846. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:10,485][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 08:03:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9543680. Throughput: 0: 278.4. Samples: 2386659. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:15,485][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 08:03:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9551872. Throughput: 0: 276.0. Samples: 2388247. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:20,484][00034] Avg episode reward: [(0, '-2.582')] [2024-08-05 08:03:24,255][00139] DAMAGECOUNT value on done: 123748.0 [2024-08-05 08:03:24,255][00139] Sum rewards: -6.997, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-2.000', 'HEALTH': '-1.404', 'weapon4': '0.010', 'AMMO2': '0.014', 'WEAPON1': '0.020', 'AMMO5': '0.025', 'ARMOR': '0.032', 'AMMO4': '0.072', 'WEAPON4': '0.100', 'HITCOUNT': '0.140', 'weapon5': '0.140', 'AMMO3': '0.164', 'WEAPON5': '0.450', 'DAMAGECOUNT': '0.570', 'WEAPON3': '0.900', 'weapon2': '1.522', 'weapon3': '1.998'} [2024-08-05 08:03:24,470][00139] DAMAGECOUNT value on done: 132257.0 [2024-08-05 08:03:24,470][00139] Sum rewards: 2.490, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.474', 'AMMO2': '0.013', 'AMMO5': '0.018', 'WEAPON4': '0.050', 'AMMO4': '0.063', 'weapon5': '0.082', 'AMMO3': '0.116', 'WEAPON5': '0.150', 'HITCOUNT': '0.280', 'ARMOR': '0.400', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.335', 'weapon2': '1.506', 'weapon3': '1.652', 'FRAGCOUNT': '3.500'} [2024-08-05 08:03:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9560064. Throughput: 0: 277.5. Samples: 2389977. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:25,484][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 08:03:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9560064. Throughput: 0: 278.9. Samples: 2390836. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:30,484][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 08:03:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9568256. Throughput: 0: 280.8. Samples: 2392542. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:35,485][00034] Avg episode reward: [(0, '-2.595')] [2024-08-05 08:03:35,493][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001168_9568256.pth... [2024-08-05 08:03:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001135_9297920.pth [2024-08-05 08:03:39,054][00139] DAMAGECOUNT value on done: 124014.0 [2024-08-05 08:03:39,055][00139] Sum rewards: -7.447, reward structure: {'DEATHCOUNT': '-10.500', 'FRAGCOUNT': '-1.500', 'HEALTH': '-1.298', 'AMMO2': '0.006', 'ARMOR': '0.016', 'AMMO5': '0.017', 'AMMO4': '0.029', 'weapon7': '0.048', 'weapon5': '0.050', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON5': '0.200', 'AMMO3': '0.203', 'HITCOUNT': '0.220', 'DAMAGECOUNT': '0.798', 'WEAPON3': '0.850', 'weapon3': '1.294', 'weapon2': '1.820'} [2024-08-05 08:03:39,291][00139] DAMAGECOUNT value on done: 132628.0 [2024-08-05 08:03:39,291][00139] Sum rewards: -1.916, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.407', 'AMMO5': '0.003', 'weapon5': '0.016', 'AMMO2': '0.017', 'ARMOR': '0.040', 'weapon7': '0.048', 'WEAPON5': '0.050', 'AMMO4': '0.084', 'AMMO3': '0.113', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.170', 'WEAPON4': '0.200', 'WEAPON7': '0.200', 'HITCOUNT': '0.240', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.700', 'DAMAGECOUNT': '1.113', 'weapon2': '1.124', 'weapon3': '1.884'} [2024-08-05 08:03:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9576448. Throughput: 0: 280.1. Samples: 2394220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:40,485][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 08:03:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9576448. Throughput: 0: 280.4. Samples: 2395077. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:45,484][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 08:03:46,955][00138] Updated weights for policy 0, policy_version 1170 (0.0018) [2024-08-05 08:03:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9584640. Throughput: 0: 279.3. Samples: 2396724. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:50,486][00034] Avg episode reward: [(0, '-2.613')] [2024-08-05 08:03:54,039][00139] DAMAGECOUNT value on done: 124348.0 [2024-08-05 08:03:54,254][00139] DAMAGECOUNT value on done: 132929.0 [2024-08-05 08:03:54,255][00139] Sum rewards: -4.524, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.706', 'AMMO5': '0.012', 'weapon5': '0.018', 'AMMO2': '0.024', 'weapon7': '0.038', 'weapon4': '0.048', 'ARMOR': '0.057', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO4': '0.118', 'WEAPON5': '0.150', 'AMMO3': '0.199', 'WEAPON4': '0.200', 'HITCOUNT': '0.230', 'DAMAGECOUNT': '0.903', 'weapon2': '1.058', 'WEAPON3': '1.250', 'FRAGCOUNT': '1.500', 'weapon3': '2.326'} [2024-08-05 08:03:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9592832. Throughput: 0: 279.5. Samples: 2398425. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:03:55,484][00034] Avg episode reward: [(0, '-2.631')] [2024-08-05 08:04:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9592832. Throughput: 0: 279.8. Samples: 2399248. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:00,484][00034] Avg episode reward: [(0, '-2.631')] [2024-08-05 08:04:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9601024. Throughput: 0: 282.2. Samples: 2400945. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:05,484][00034] Avg episode reward: [(0, '-2.631')] [2024-08-05 08:04:09,114][00139] DAMAGECOUNT value on done: 124633.0 [2024-08-05 08:04:09,115][00139] Sum rewards: -0.508, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.582', 'weapon5': '0.008', 'ARMOR': '0.008', 'AMMO2': '0.012', 'AMMO5': '0.020', 'AMMO4': '0.057', 'WEAPON4': '0.100', 'AMMO3': '0.198', 'HITCOUNT': '0.250', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.855', 'WEAPON3': '1.000', 'weapon2': '1.324', 'weapon3': '2.492', 'FRAGCOUNT': '5.000'} [2024-08-05 08:04:09,356][00139] DAMAGECOUNT value on done: 133159.0 [2024-08-05 08:04:09,356][00139] Sum rewards: -5.617, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.080', 'AMMO4': '-0.081', 'AMMO2': '-0.016', 'AMMO5': '0.009', 'weapon5': '0.056', 'ARMOR': '0.107', 'AMMO3': '0.147', 'WEAPON5': '0.250', 'HITCOUNT': '0.250', 'DAMAGECOUNT': '0.690', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.714', 'weapon3': '1.886'} [2024-08-05 08:04:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9609216. Throughput: 0: 280.7. Samples: 2402609. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:10,484][00034] Avg episode reward: [(0, '-2.660')] [2024-08-05 08:04:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9609216. Throughput: 0: 280.8. Samples: 2403471. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:15,484][00034] Avg episode reward: [(0, '-2.660')] [2024-08-05 08:04:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9617408. Throughput: 0: 280.6. Samples: 2405168. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:20,484][00034] Avg episode reward: [(0, '-2.660')] [2024-08-05 08:04:24,064][00139] DAMAGECOUNT value on done: 124920.0 [2024-08-05 08:04:24,302][00139] DAMAGECOUNT value on done: 133349.0 [2024-08-05 08:04:24,303][00139] Sum rewards: -1.936, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.950', 'AMMO4': '-0.055', 'AMMO2': '-0.011', 'WEAPON1': '0.020', 'AMMO5': '0.022', 'ARMOR': '0.040', 'WEAPON4': '0.100', 'weapon5': '0.102', 'AMMO3': '0.145', 'HITCOUNT': '0.160', 'weapon4': '0.198', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.570', 'WEAPON3': '0.900', 'FRAGCOUNT': '1.000', 'weapon2': '1.394', 'weapon3': '1.778'} [2024-08-05 08:04:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9625600. Throughput: 0: 280.3. Samples: 2406834. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:25,485][00034] Avg episode reward: [(0, '-2.637')] [2024-08-05 08:04:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9625600. Throughput: 0: 280.7. Samples: 2407709. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:30,484][00034] Avg episode reward: [(0, '-2.637')] [2024-08-05 08:04:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9633792. Throughput: 0: 282.2. Samples: 2409425. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:35,485][00034] Avg episode reward: [(0, '-2.637')] [2024-08-05 08:04:38,703][00139] DAMAGECOUNT value on done: 125516.0 [2024-08-05 08:04:38,704][00139] Sum rewards: -1.872, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-2.290', 'AMMO2': '0.001', 'AMMO4': '0.005', 'AMMO5': '0.009', 'WEAPON1': '0.020', 'ARMOR': '0.048', 'AMMO3': '0.123', 'weapon5': '0.184', 'WEAPON5': '0.200', 'WEAPON4': '0.250', 'weapon4': '0.324', 'HITCOUNT': '0.410', 'WEAPON3': '0.600', 'weapon3': '1.238', 'weapon2': '1.468', 'DAMAGECOUNT': '1.788', 'FRAGCOUNT': '3.500'} [2024-08-05 08:04:38,941][00139] DAMAGECOUNT value on done: 134508.0 [2024-08-05 08:04:38,942][00139] Sum rewards: 6.728, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.048', 'AMMO5': '0.007', 'ARMOR': '0.008', 'AMMO2': '0.019', 'weapon7': '0.068', 'AMMO4': '0.092', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'WEAPON4': '0.100', 'AMMO3': '0.140', 'weapon5': '0.170', 'WEAPON5': '0.200', 'weapon4': '0.306', 'HITCOUNT': '0.330', 'WEAPON3': '0.600', 'weapon2': '1.486', 'weapon3': '1.602', 'DAMAGECOUNT': '2.598', 'FRAGCOUNT': '7.000'} [2024-08-05 08:04:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9641984. Throughput: 0: 282.5. Samples: 2411137. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:40,485][00034] Avg episode reward: [(0, '-2.486')] [2024-08-05 08:04:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9650176. Throughput: 0: 283.7. Samples: 2412013. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:45,484][00034] Avg episode reward: [(0, '-2.486')] [2024-08-05 08:04:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9650176. Throughput: 0: 283.8. Samples: 2413714. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:50,484][00034] Avg episode reward: [(0, '-2.486')] [2024-08-05 08:04:53,638][00139] DAMAGECOUNT value on done: 125826.0 [2024-08-05 08:04:53,639][00139] Sum rewards: 0.669, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.892', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'AMMO5': '0.017', 'AMMO3': '0.097', 'weapon5': '0.120', 'HITCOUNT': '0.170', 'WEAPON5': '0.300', 'ARMOR': '0.432', 'WEAPON3': '0.600', 'DAMAGECOUNT': '0.930', 'weapon3': '1.642', 'weapon2': '1.764', 'FRAGCOUNT': '3.000'} [2024-08-05 08:04:53,847][00139] DAMAGECOUNT value on done: 134758.0 [2024-08-05 08:04:53,848][00139] Sum rewards: -5.681, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.189', 'AMMO2': '0.008', 'AMMO5': '0.013', 'ARMOR': '0.040', 'AMMO4': '0.042', 'weapon5': '0.102', 'AMMO3': '0.163', 'WEAPON5': '0.200', 'HITCOUNT': '0.260', 'FRAGCOUNT': '0.500', 'DAMAGECOUNT': '0.750', 'WEAPON3': '0.950', 'weapon2': '1.564', 'weapon3': '2.166'} [2024-08-05 08:04:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9658368. Throughput: 0: 283.4. Samples: 2415364. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:04:55,484][00034] Avg episode reward: [(0, '-2.398')] [2024-08-05 08:04:59,661][00138] Updated weights for policy 0, policy_version 1180 (0.0017) [2024-08-05 08:05:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9666560. Throughput: 0: 283.9. Samples: 2416245. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:00,484][00034] Avg episode reward: [(0, '-2.398')] [2024-08-05 08:05:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9666560. Throughput: 0: 284.0. Samples: 2417949. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:05,484][00034] Avg episode reward: [(0, '-2.398')] [2024-08-05 08:05:08,271][00139] DAMAGECOUNT value on done: 126727.0 [2024-08-05 08:05:08,272][00139] Sum rewards: 7.609, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-0.593', 'AMMO2': '0.001', 'AMMO4': '0.006', 'AMMO5': '0.016', 'ARMOR': '0.060', 'AMMO3': '0.085', 'WEAPON4': '0.100', 'weapon4': '0.110', 'weapon7': '0.122', 'weapon5': '0.160', 'WEAPON5': '0.200', 'AMMO6': '0.240', 'AMMO7': '0.240', 'WEAPON7': '0.400', 'HITCOUNT': '0.440', 'WEAPON3': '0.550', 'weapon2': '1.338', 'weapon3': '1.930', 'DAMAGECOUNT': '2.703', 'FRAGCOUNT': '5.500'} [2024-08-05 08:05:08,510][00139] DAMAGECOUNT value on done: 135198.0 [2024-08-05 08:05:08,510][00139] Sum rewards: -6.178, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-1.662', 'AMMO5': '0.005', 'weapon4': '0.006', 'AMMO2': '0.018', 'weapon5': '0.022', 'ARMOR': '0.040', 'AMMO4': '0.090', 'WEAPON4': '0.100', 'WEAPON5': '0.100', 'AMMO3': '0.199', 'HITCOUNT': '0.340', 'WEAPON3': '1.200', 'DAMAGECOUNT': '1.320', 'weapon2': '1.556', 'weapon3': '1.988', 'FRAGCOUNT': '2.000'} [2024-08-05 08:05:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9674752. Throughput: 0: 284.8. Samples: 2419648. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:10,485][00034] Avg episode reward: [(0, '-2.313')] [2024-08-05 08:05:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9682944. Throughput: 0: 284.4. Samples: 2420506. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:15,485][00034] Avg episode reward: [(0, '-2.313')] [2024-08-05 08:05:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9682944. Throughput: 0: 285.2. Samples: 2422258. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:20,484][00034] Avg episode reward: [(0, '-2.313')] [2024-08-05 08:05:22,953][00139] DAMAGECOUNT value on done: 126881.0 [2024-08-05 08:05:22,954][00139] Sum rewards: -6.100, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.692', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.013', 'AMMO2': '0.028', 'ARMOR': '0.040', 'weapon5': '0.042', 'AMMO3': '0.125', 'AMMO4': '0.139', 'HITCOUNT': '0.150', 'WEAPON5': '0.200', 'WEAPON4': '0.300', 'weapon4': '0.398', 'DAMAGECOUNT': '0.462', 'WEAPON3': '0.850', 'weapon2': '1.374', 'weapon3': '1.722'} [2024-08-05 08:05:23,222][00139] DAMAGECOUNT value on done: 135495.0 [2024-08-05 08:05:23,222][00139] Sum rewards: -1.065, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-1.389', 'AMMO5': '0.019', 'WEAPON1': '0.020', 'AMMO2': '0.020', 'ARMOR': '0.040', 'weapon7': '0.072', 'weapon5': '0.090', 'AMMO4': '0.100', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.109', 'HITCOUNT': '0.150', 'WEAPON4': '0.150', 'weapon4': '0.208', 'WEAPON5': '0.350', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.891', 'weapon2': '1.192', 'weapon3': '1.362', 'FRAGCOUNT': '1.500'} [2024-08-05 08:05:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9691136. Throughput: 0: 284.1. Samples: 2423920. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:25,484][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9699328. Throughput: 0: 284.2. Samples: 2424802. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:30,485][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9699328. Throughput: 0: 284.2. Samples: 2426504. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:35,486][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:35,505][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001185_9707520.pth... [2024-08-05 08:05:35,583][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001152_9437184.pth [2024-08-05 08:05:37,771][00139] DAMAGECOUNT value on done: 127430.0 [2024-08-05 08:05:37,771][00139] Sum rewards: -2.354, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.588', 'AMMO5': '0.007', 'AMMO2': '0.019', 'ARMOR': '0.060', 'weapon5': '0.074', 'AMMO4': '0.095', 'WEAPON4': '0.100', 'WEAPON5': '0.150', 'AMMO3': '0.169', 'HITCOUNT': '0.270', 'WEAPON3': '0.950', 'weapon2': '1.178', 'DAMAGECOUNT': '1.647', 'FRAGCOUNT': '2.000', 'weapon3': '2.014'} [2024-08-05 08:05:37,988][00139] DAMAGECOUNT value on done: 135710.0 [2024-08-05 08:05:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9707520. Throughput: 0: 285.2. Samples: 2428200. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:40,484][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9715712. Throughput: 0: 284.3. Samples: 2429038. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:45,485][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9715712. Throughput: 0: 281.3. Samples: 2430607. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:50,484][00034] Avg episode reward: [(0, '-2.306')] [2024-08-05 08:05:53,118][00139] DAMAGECOUNT value on done: 128004.0 [2024-08-05 08:05:53,119][00139] Sum rewards: -7.057, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.665', 'AMMO2': '0.013', 'AMMO5': '0.016', 'weapon4': '0.044', 'AMMO4': '0.065', 'WEAPON4': '0.100', 'weapon7': '0.104', 'weapon5': '0.156', 'WEAPON5': '0.200', 'AMMO3': '0.222', 'AMMO6': '0.280', 'AMMO7': '0.280', 'HITCOUNT': '0.290', 'WEAPON7': '0.400', 'WEAPON3': '1.150', 'weapon3': '1.352', 'FRAGCOUNT': '1.500', 'DAMAGECOUNT': '1.722', 'weapon2': '1.964'} [2024-08-05 08:05:53,334][00139] DAMAGECOUNT value on done: 135931.0 [2024-08-05 08:05:53,335][00139] Sum rewards: 2.997, reward structure: {'DEATHCOUNT': '-5.250', 'AMMO5': '0.010', 'AMMO2': '0.014', 'weapon5': '0.042', 'WEAPON4': '0.050', 'AMMO4': '0.071', 'weapon4': '0.086', 'HITCOUNT': '0.100', 'AMMO3': '0.106', 'WEAPON5': '0.150', 'HEALTH': '0.452', 'ARMOR': '0.455', 'WEAPON3': '0.500', 'DAMAGECOUNT': '0.663', 'weapon2': '1.400', 'FRAGCOUNT': '2.000', 'weapon3': '2.148'} [2024-08-05 08:05:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9723904. Throughput: 0: 280.2. Samples: 2432257. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:05:55,484][00034] Avg episode reward: [(0, '-2.236')] [2024-08-05 08:06:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9732096. Throughput: 0: 279.9. Samples: 2433100. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:00,484][00034] Avg episode reward: [(0, '-2.236')] [2024-08-05 08:06:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9740288. Throughput: 0: 278.4. Samples: 2434788. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:05,484][00034] Avg episode reward: [(0, '-2.236')] [2024-08-05 08:06:08,309][00139] DAMAGECOUNT value on done: 128295.0 [2024-08-05 08:06:08,310][00139] Sum rewards: -4.481, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-0.704', 'AMMO2': '0.008', 'AMMO5': '0.015', 'WEAPON1': '0.020', 'AMMO4': '0.041', 'WEAPON4': '0.100', 'HITCOUNT': '0.120', 'AMMO3': '0.125', 'weapon5': '0.138', 'weapon4': '0.144', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.250', 'ARMOR': '0.420', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.700', 'DAMAGECOUNT': '0.873', 'weapon2': '1.656', 'weapon3': '1.762'} [2024-08-05 08:06:08,563][00139] DAMAGECOUNT value on done: 136341.0 [2024-08-05 08:06:08,564][00139] Sum rewards: 2.670, reward structure: {'DEATHCOUNT': '-5.250', 'HEALTH': '-1.160', 'AMMO4': '-0.001', 'AMMO2': '-0.000', 'AMMO5': '0.019', 'WEAPON1': '0.030', 'AMMO3': '0.090', 'HITCOUNT': '0.170', 'weapon5': '0.404', 'WEAPON5': '0.500', 'WEAPON3': '0.550', 'DAMAGECOUNT': '1.230', 'weapon3': '1.282', 'weapon2': '1.806', 'FRAGCOUNT': '3.000'} [2024-08-05 08:06:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9740288. Throughput: 0: 277.5. Samples: 2436409. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:10,484][00034] Avg episode reward: [(0, '-2.268')] [2024-08-05 08:06:12,559][00138] Updated weights for policy 0, policy_version 1190 (0.0017) [2024-08-05 08:06:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9748480. Throughput: 0: 277.1. Samples: 2437271. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:15,484][00034] Avg episode reward: [(0, '-2.268')] [2024-08-05 08:06:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9756672. Throughput: 0: 277.8. Samples: 2439004. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:20,484][00034] Avg episode reward: [(0, '-2.268')] [2024-08-05 08:06:23,152][00139] DAMAGECOUNT value on done: 128490.0 [2024-08-05 08:06:23,153][00139] Sum rewards: -4.340, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.994', 'AMMO5': '0.013', 'AMMO2': '0.032', 'weapon5': '0.032', 'ARMOR': '0.117', 'AMMO4': '0.157', 'AMMO3': '0.166', 'WEAPON4': '0.200', 'weapon4': '0.204', 'HITCOUNT': '0.220', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.585', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon2': '1.516', 'weapon3': '1.962'} [2024-08-05 08:06:23,369][00139] DAMAGECOUNT value on done: 136851.0 [2024-08-05 08:06:23,370][00139] Sum rewards: -2.442, reward structure: {'DEATHCOUNT': '-12.000', 'HEALTH': '-2.750', 'AMMO5': '0.015', 'AMMO2': '0.022', 'weapon5': '0.030', 'AMMO4': '0.110', 'AMMO3': '0.150', 'weapon4': '0.168', 'WEAPON5': '0.250', 'WEAPON4': '0.250', 'HITCOUNT': '0.300', 'ARMOR': '0.400', 'WEAPON3': '0.750', 'weapon3': '1.304', 'DAMAGECOUNT': '1.530', 'weapon2': '2.028', 'FRAGCOUNT': '5.000'} [2024-08-05 08:06:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9756672. Throughput: 0: 277.7. Samples: 2440698. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:25,484][00034] Avg episode reward: [(0, '-2.265')] [2024-08-05 08:06:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9764864. Throughput: 0: 276.3. Samples: 2441472. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:30,484][00034] Avg episode reward: [(0, '-2.265')] [2024-08-05 08:06:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9773056. Throughput: 0: 279.6. Samples: 2443191. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:35,485][00034] Avg episode reward: [(0, '-2.265')] [2024-08-05 08:06:38,329][00139] DAMAGECOUNT value on done: 128914.0 [2024-08-05 08:06:38,329][00139] Sum rewards: -1.765, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-0.970', 'ARMOR': '0.004', 'AMMO5': '0.017', 'AMMO2': '0.025', 'weapon5': '0.082', 'AMMO4': '0.123', 'WEAPON5': '0.200', 'AMMO3': '0.201', 'HITCOUNT': '0.280', 'WEAPON3': '0.900', 'DAMAGECOUNT': '1.272', 'weapon2': '1.720', 'weapon3': '1.880', 'FRAGCOUNT': '3.000'} [2024-08-05 08:06:38,562][00139] DAMAGECOUNT value on done: 137038.0 [2024-08-05 08:06:38,563][00139] Sum rewards: -1.584, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.729', 'AMMO5': '0.003', 'AMMO2': '0.018', 'WEAPON5': '0.050', 'weapon5': '0.060', 'ARMOR': '0.072', 'AMMO4': '0.090', 'weapon4': '0.092', 'AMMO3': '0.127', 'WEAPON4': '0.150', 'HITCOUNT': '0.160', 'DAMAGECOUNT': '0.561', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.618', 'weapon3': '1.894'} [2024-08-05 08:06:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9773056. Throughput: 0: 279.6. Samples: 2444837. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:40,485][00034] Avg episode reward: [(0, '-2.226')] [2024-08-05 08:06:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9781248. Throughput: 0: 279.1. Samples: 2445661. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:45,485][00034] Avg episode reward: [(0, '-2.226')] [2024-08-05 08:06:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9789440. Throughput: 0: 279.6. Samples: 2447371. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:50,485][00034] Avg episode reward: [(0, '-2.226')] [2024-08-05 08:06:53,123][00139] DAMAGECOUNT value on done: 129156.0 [2024-08-05 08:06:53,124][00139] Sum rewards: -0.475, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.252', 'AMMO5': '0.003', 'AMMO2': '0.017', 'WEAPON5': '0.050', 'ARMOR': '0.084', 'AMMO4': '0.085', 'WEAPON4': '0.100', 'AMMO3': '0.147', 'HITCOUNT': '0.210', 'DAMAGECOUNT': '0.726', 'WEAPON3': '0.750', 'weapon2': '1.392', 'FRAGCOUNT': '2.000', 'weapon3': '2.464'} [2024-08-05 08:06:53,352][00139] DAMAGECOUNT value on done: 137465.0 [2024-08-05 08:06:53,353][00139] Sum rewards: 3.347, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.090', 'AMMO2': '0.029', 'ARMOR': '0.070', 'AMMO3': '0.092', 'AMMO4': '0.145', 'weapon4': '0.272', 'WEAPON4': '0.300', 'HITCOUNT': '0.380', 'WEAPON3': '0.600', 'DAMAGECOUNT': '1.281', 'weapon2': '1.364', 'weapon3': '1.654', 'FRAGCOUNT': '4.000'} [2024-08-05 08:06:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9789440. Throughput: 0: 281.7. Samples: 2449087. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:06:55,485][00034] Avg episode reward: [(0, '-2.203')] [2024-08-05 08:07:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9797632. Throughput: 0: 279.9. Samples: 2449865. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:00,484][00034] Avg episode reward: [(0, '-2.203')] [2024-08-05 08:07:05,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9805824. Throughput: 0: 278.7. Samples: 2451544. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:05,484][00034] Avg episode reward: [(0, '-2.203')] [2024-08-05 08:07:08,384][00139] DAMAGECOUNT value on done: 129481.0 [2024-08-05 08:07:08,385][00139] Sum rewards: 0.499, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-0.175', 'AMMO5': '0.005', 'AMMO2': '0.017', 'ARMOR': '0.036', 'WEAPON5': '0.050', 'AMMO4': '0.082', 'AMMO3': '0.129', 'WEAPON4': '0.150', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'weapon4': '0.246', 'HITCOUNT': '0.250', 'WEAPON3': '0.850', 'DAMAGECOUNT': '0.975', 'weapon2': '1.176', 'FRAGCOUNT': '2.000', 'weapon3': '2.358'} [2024-08-05 08:07:08,622][00139] DAMAGECOUNT value on done: 138334.0 [2024-08-05 08:07:08,623][00139] Sum rewards: 6.356, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-0.394', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'AMMO2': '0.023', 'ARMOR': '0.040', 'WEAPON4': '0.050', 'weapon7': '0.052', 'AMMO4': '0.113', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.126', 'AMMO3': '0.131', 'WEAPON7': '0.200', 'WEAPON5': '0.300', 'HITCOUNT': '0.410', 'WEAPON3': '0.650', 'weapon2': '1.616', 'weapon3': '1.654', 'DAMAGECOUNT': '2.607', 'FRAGCOUNT': '6.000'} [2024-08-05 08:07:10,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9805824. Throughput: 0: 277.7. Samples: 2453196. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:10,484][00034] Avg episode reward: [(0, '-2.012')] [2024-08-05 08:07:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9814016. Throughput: 0: 279.6. Samples: 2454052. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:15,487][00034] Avg episode reward: [(0, '-2.012')] [2024-08-05 08:07:20,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9822208. Throughput: 0: 278.7. Samples: 2455732. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:20,484][00034] Avg episode reward: [(0, '-2.012')] [2024-08-05 08:07:23,446][00139] DAMAGECOUNT value on done: 129503.0 [2024-08-05 08:07:23,681][00139] DAMAGECOUNT value on done: 139074.0 [2024-08-05 08:07:23,681][00139] Sum rewards: 0.127, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-0.658', 'AMMO2': '0.005', 'AMMO4': '0.027', 'ARMOR': '0.040', 'weapon7': '0.072', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.129', 'HITCOUNT': '0.480', 'WEAPON3': '0.800', 'weapon2': '1.548', 'weapon3': '1.914', 'DAMAGECOUNT': '2.220', 'FRAGCOUNT': '3.000'} [2024-08-05 08:07:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9822208. Throughput: 0: 279.7. Samples: 2457425. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:25,484][00034] Avg episode reward: [(0, '-1.939')] [2024-08-05 08:07:25,845][00138] Updated weights for policy 0, policy_version 1200 (0.0017) [2024-08-05 08:07:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9830400. Throughput: 0: 279.3. Samples: 2458231. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:30,484][00034] Avg episode reward: [(0, '-1.939')] [2024-08-05 08:07:35,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9838592. Throughput: 0: 276.9. Samples: 2459830. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:35,484][00034] Avg episode reward: [(0, '-1.939')] [2024-08-05 08:07:35,492][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001201_9838592.pth... 
[2024-08-05 08:07:35,568][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001168_9568256.pth [2024-08-05 08:07:38,800][00139] DAMAGECOUNT value on done: 129762.0 [2024-08-05 08:07:38,801][00139] Sum rewards: -8.812, reward structure: {'DEATHCOUNT': '-9.750', 'FRAGCOUNT': '-3.500', 'HEALTH': '-1.550', 'AMMO2': '0.000', 'AMMO4': '0.001', 'AMMO5': '0.013', 'WEAPON1': '0.020', 'WEAPON4': '0.100', 'weapon5': '0.128', 'AMMO3': '0.162', 'HITCOUNT': '0.210', 'weapon4': '0.228', 'WEAPON5': '0.300', 'ARMOR': '0.536', 'DAMAGECOUNT': '0.777', 'WEAPON3': '0.800', 'weapon2': '1.328', 'weapon3': '1.384'} [2024-08-05 08:07:39,058][00139] DAMAGECOUNT value on done: 139554.0 [2024-08-05 08:07:39,058][00139] Sum rewards: -3.472, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.876', 'AMMO2': '0.005', 'AMMO5': '0.008', 'weapon4': '0.016', 'weapon5': '0.018', 'AMMO4': '0.023', 'ARMOR': '0.032', 'WEAPON4': '0.050', 'WEAPON5': '0.100', 'AMMO3': '0.173', 'HITCOUNT': '0.380', 'WEAPON3': '1.150', 'DAMAGECOUNT': '1.440', 'weapon2': '1.482', 'weapon3': '2.278', 'FRAGCOUNT': '2.500'} [2024-08-05 08:07:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9838592. Throughput: 0: 276.2. Samples: 2461517. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:40,484][00034] Avg episode reward: [(0, '-1.998')] [2024-08-05 08:07:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9846784. Throughput: 0: 277.6. Samples: 2462356. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:45,484][00034] Avg episode reward: [(0, '-1.998')] [2024-08-05 08:07:50,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9854976. Throughput: 0: 278.0. Samples: 2464055. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:50,484][00034] Avg episode reward: [(0, '-1.998')] [2024-08-05 08:07:53,701][00139] DAMAGECOUNT value on done: 130050.0 [2024-08-05 08:07:53,701][00139] Sum rewards: -4.297, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-3.110', 'AMMO4': '-0.010', 'AMMO2': '-0.002', 'ARMOR': '0.004', 'AMMO5': '0.013', 'weapon5': '0.050', 'AMMO3': '0.178', 'AMMO6': '0.200', 'WEAPON7': '0.200', 'AMMO7': '0.200', 'WEAPON5': '0.250', 'HITCOUNT': '0.280', 'DAMAGECOUNT': '0.864', 'WEAPON3': '1.200', 'weapon2': '1.398', 'weapon3': '2.238', 'FRAGCOUNT': '3.000'} [2024-08-05 08:07:53,944][00139] DAMAGECOUNT value on done: 139843.0 [2024-08-05 08:07:53,944][00139] Sum rewards: -2.442, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-0.772', 'AMMO5': '0.012', 'AMMO2': '0.018', 'weapon5': '0.062', 'AMMO4': '0.088', 'weapon7': '0.096', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.136', 'WEAPON5': '0.200', 'WEAPON4': '0.200', 'weapon4': '0.260', 'HITCOUNT': '0.310', 'ARMOR': '0.412', 'FRAGCOUNT': '0.500', 'WEAPON3': '0.650', 'DAMAGECOUNT': '0.867', 'weapon3': '1.518', 'weapon2': '1.700'} [2024-08-05 08:07:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9863168. Throughput: 0: 278.6. Samples: 2465734. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:07:55,484][00034] Avg episode reward: [(0, '-1.989')] [2024-08-05 08:08:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9863168. Throughput: 0: 278.3. Samples: 2466577. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:00,484][00034] Avg episode reward: [(0, '-1.989')] [2024-08-05 08:08:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9871360. Throughput: 0: 277.1. Samples: 2468201. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:05,485][00034] Avg episode reward: [(0, '-1.989')] [2024-08-05 08:08:08,635][00139] DAMAGECOUNT value on done: 130265.0 [2024-08-05 08:08:08,636][00139] Sum rewards: -6.417, reward structure: {'DEATHCOUNT': '-8.250', 'FRAGCOUNT': '-2.000', 'HEALTH': '-1.323', 'ARMOR': '0.016', 'AMMO5': '0.017', 'WEAPON1': '0.020', 'AMMO2': '0.026', 'WEAPON4': '0.050', 'weapon5': '0.088', 'AMMO4': '0.130', 'HITCOUNT': '0.140', 'weapon4': '0.142', 'AMMO3': '0.146', 'WEAPON5': '0.250', 'DAMAGECOUNT': '0.645', 'WEAPON3': '0.850', 'weapon2': '1.102', 'weapon3': '1.534'} [2024-08-05 08:08:08,916][00139] DAMAGECOUNT value on done: 140232.0 [2024-08-05 08:08:08,916][00139] Sum rewards: -5.717, reward structure: {'DEATHCOUNT': '-14.250', 'HEALTH': '-2.620', 'weapon5': '0.004', 'AMMO2': '0.005', 'AMMO5': '0.025', 'AMMO4': '0.027', 'ARMOR': '0.040', 'WEAPON1': '0.040', 'AMMO3': '0.219', 'HITCOUNT': '0.280', 'WEAPON5': '0.350', 'DAMAGECOUNT': '1.167', 'WEAPON3': '1.250', 'weapon2': '1.810', 'weapon3': '1.936', 'FRAGCOUNT': '4.000'} [2024-08-05 08:08:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9879552. Throughput: 0: 277.7. Samples: 2469923. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:10,485][00034] Avg episode reward: [(0, '-2.054')] [2024-08-05 08:08:15,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9879552. Throughput: 0: 278.7. Samples: 2470772. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:15,484][00034] Avg episode reward: [(0, '-2.054')] [2024-08-05 08:08:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9887744. Throughput: 0: 280.8. Samples: 2472464. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:20,484][00034] Avg episode reward: [(0, '-2.054')] [2024-08-05 08:08:23,374][00139] DAMAGECOUNT value on done: 130496.0 [2024-08-05 08:08:23,374][00139] Sum rewards: -10.528, reward structure: {'DEATHCOUNT': '-12.000', 'FRAGCOUNT': '-3.500', 'HEALTH': '-1.370', 'AMMO2': '0.013', 'AMMO5': '0.016', 'ARMOR': '0.056', 'AMMO4': '0.063', 'weapon7': '0.068', 'HITCOUNT': '0.110', 'AMMO6': '0.120', 'AMMO7': '0.120', 'AMMO3': '0.148', 'WEAPON7': '0.200', 'weapon5': '0.238', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.693', 'WEAPON3': '0.700', 'weapon3': '1.526', 'weapon2': '1.870'} [2024-08-05 08:08:23,610][00139] DAMAGECOUNT value on done: 140562.0 [2024-08-05 08:08:23,610][00139] Sum rewards: -3.428, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.056', 'AMMO4': '-0.006', 'AMMO2': '-0.001', 'weapon5': '0.006', 'AMMO5': '0.017', 'weapon7': '0.076', 'AMMO6': '0.100', 'AMMO7': '0.100', 'WEAPON7': '0.100', 'AMMO3': '0.162', 'WEAPON5': '0.250', 'HITCOUNT': '0.310', 'ARMOR': '0.428', 'WEAPON3': '0.950', 'DAMAGECOUNT': '0.990', 'weapon3': '1.770', 'weapon2': '1.876', 'FRAGCOUNT': '2.000'} [2024-08-05 08:08:25,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9895936. Throughput: 0: 282.4. Samples: 2474224. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:25,484][00034] Avg episode reward: [(0, '-2.145')] [2024-08-05 08:08:30,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9895936. Throughput: 0: 281.7. Samples: 2475031. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:30,485][00034] Avg episode reward: [(0, '-2.145')] [2024-08-05 08:08:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9904128. Throughput: 0: 280.9. Samples: 2476694. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:35,485][00034] Avg episode reward: [(0, '-2.145')] [2024-08-05 08:08:38,435][00139] DAMAGECOUNT value on done: 130671.0 [2024-08-05 08:08:38,436][00139] Sum rewards: -4.183, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.450', 'AMMO2': '0.007', 'weapon5': '0.008', 'AMMO5': '0.015', 'weapon4': '0.016', 'WEAPON1': '0.020', 'AMMO4': '0.032', 'WEAPON4': '0.100', 'AMMO3': '0.142', 'HITCOUNT': '0.150', 'WEAPON5': '0.200', 'ARMOR': '0.404', 'DAMAGECOUNT': '0.525', 'WEAPON3': '0.750', 'FRAGCOUNT': '1.000', 'weapon2': '1.666', 'weapon3': '1.982'} [2024-08-05 08:08:38,669][00139] DAMAGECOUNT value on done: 141063.0 [2024-08-05 08:08:38,670][00139] Sum rewards: 0.620, reward structure: {'DEATHCOUNT': '-6.750', 'HEALTH': '-0.867', 'AMMO4': '-0.014', 'AMMO2': '-0.003', 'AMMO5': '0.006', 'AMMO3': '0.109', 'weapon5': '0.120', 'WEAPON5': '0.150', 'HITCOUNT': '0.310', 'WEAPON3': '0.700', 'weapon3': '1.306', 'DAMAGECOUNT': '1.503', 'FRAGCOUNT': '2.000', 'weapon2': '2.050'} [2024-08-05 08:08:39,096][00138] Updated weights for policy 0, policy_version 1210 (0.0018) [2024-08-05 08:08:40,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9912320. Throughput: 0: 281.2. Samples: 2478387. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:40,484][00034] Avg episode reward: [(0, '-2.151')] [2024-08-05 08:08:45,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9912320. Throughput: 0: 281.2. Samples: 2479230. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:45,484][00034] Avg episode reward: [(0, '-2.151')] [2024-08-05 08:08:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9920512. Throughput: 0: 282.6. Samples: 2480917. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:50,485][00034] Avg episode reward: [(0, '-2.151')] [2024-08-05 08:08:53,443][00139] DAMAGECOUNT value on done: 131026.0 [2024-08-05 08:08:53,444][00139] Sum rewards: -1.293, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.672', 'AMMO2': '0.004', 'AMMO5': '0.015', 'AMMO4': '0.019', 'WEAPON1': '0.020', 'ARMOR': '0.020', 'weapon4': '0.100', 'weapon5': '0.100', 'WEAPON4': '0.150', 'AMMO3': '0.160', 'WEAPON5': '0.250', 'HITCOUNT': '0.310', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.065', 'weapon2': '1.550', 'weapon3': '1.816', 'FRAGCOUNT': '3.000'} [2024-08-05 08:08:53,733][00139] DAMAGECOUNT value on done: 141553.0 [2024-08-05 08:08:53,734][00139] Sum rewards: 0.431, reward structure: {'DEATHCOUNT': '-8.250', 'HEALTH': '-1.237', 'AMMO2': '0.002', 'AMMO4': '0.009', 'AMMO5': '0.012', 'WEAPON1': '0.020', 'ARMOR': '0.028', 'weapon5': '0.060', 'weapon7': '0.106', 'AMMO3': '0.115', 'WEAPON5': '0.200', 'HITCOUNT': '0.290', 'AMMO6': '0.300', 'WEAPON7': '0.300', 'AMMO7': '0.300', 'WEAPON3': '0.750', 'weapon2': '1.264', 'DAMAGECOUNT': '1.470', 'weapon3': '2.192', 'FRAGCOUNT': '2.500'} [2024-08-05 08:08:55,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9928704. Throughput: 0: 281.7. Samples: 2482601. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:08:55,484][00034] Avg episode reward: [(0, '-2.116')] [2024-08-05 08:09:00,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9928704. Throughput: 0: 282.2. Samples: 2483473. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:00,485][00034] Avg episode reward: [(0, '-2.116')] [2024-08-05 08:09:04,300][00139] Large shaping reward 2.632 for [('FRAGCOUNT', 2.0, 2.0), ('HITCOUNT', 0.03, 3.0), ('DAMAGECOUNT', 0.6, 200), ('weapon7', 0.002)] [2024-08-05 08:09:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9936896. 
Throughput: 0: 281.2. Samples: 2485119. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:05,485][00034] Avg episode reward: [(0, '-2.116')] [2024-08-05 08:09:08,433][00139] DAMAGECOUNT value on done: 131690.0 [2024-08-05 08:09:08,433][00139] Sum rewards: -3.474, reward structure: {'DEATHCOUNT': '-13.500', 'HEALTH': '-2.092', 'AMMO2': '0.011', 'AMMO5': '0.013', 'ARMOR': '0.036', 'AMMO4': '0.054', 'weapon7': '0.076', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon5': '0.168', 'AMMO3': '0.191', 'WEAPON5': '0.200', 'WEAPON7': '0.200', 'weapon4': '0.220', 'WEAPON4': '0.250', 'HITCOUNT': '0.320', 'weapon2': '1.014', 'WEAPON3': '1.100', 'DAMAGECOUNT': '1.917', 'weapon3': '2.108', 'FRAGCOUNT': '4.000'} [2024-08-05 08:09:08,667][00139] DAMAGECOUNT value on done: 141982.0 [2024-08-05 08:09:08,667][00139] Sum rewards: -5.057, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-1.172', 'FRAGCOUNT': '-0.500', 'AMMO2': '0.011', 'WEAPON1': '0.020', 'AMMO5': '0.026', 'AMMO4': '0.057', 'AMMO3': '0.117', 'HITCOUNT': '0.130', 'weapon5': '0.218', 'WEAPON3': '0.550', 'WEAPON5': '0.600', 'DAMAGECOUNT': '1.287', 'weapon3': '1.342', 'weapon2': '2.006'} [2024-08-05 08:09:10,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9945088. Throughput: 0: 279.4. Samples: 2486796. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:10,484][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 08:09:15,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9953280. Throughput: 0: 281.0. Samples: 2487677. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:15,484][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 08:09:20,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9953280. Throughput: 0: 282.2. Samples: 2489392. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:20,485][00034] Avg episode reward: [(0, '-2.195')] [2024-08-05 08:09:23,126][00139] DAMAGECOUNT value on done: 132125.0 [2024-08-05 08:09:23,126][00139] Sum rewards: -4.838, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-1.100', 'FRAGCOUNT': '0.000', 'AMMO2': '0.001', 'AMMO4': '0.005', 'AMMO5': '0.025', 'weapon5': '0.114', 'AMMO3': '0.160', 'HITCOUNT': '0.380', 'WEAPON5': '0.450', 'WEAPON3': '0.750', 'DAMAGECOUNT': '1.305', 'weapon2': '1.656', 'weapon3': '1.916'} [2024-08-05 08:09:23,370][00139] DAMAGECOUNT value on done: 142137.0 [2024-08-05 08:09:23,371][00139] Sum rewards: -5.237, reward structure: {'DEATHCOUNT': '-10.500', 'HEALTH': '-2.000', 'AMMO4': '-0.029', 'AMMO2': '-0.006', 'weapon5': '0.010', 'AMMO5': '0.015', 'WEAPON1': '0.020', 'WEAPON4': '0.050', 'HITCOUNT': '0.140', 'AMMO3': '0.168', 'WEAPON5': '0.200', 'DAMAGECOUNT': '0.465', 'ARMOR': '0.557', 'WEAPON3': '0.950', 'FRAGCOUNT': '1.000', 'weapon3': '1.832', 'weapon2': '1.890'} [2024-08-05 08:09:25,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1138.5). Total num frames: 9961472. Throughput: 0: 282.5. Samples: 2491099. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:25,484][00034] Avg episode reward: [(0, '-2.230')] [2024-08-05 08:09:30,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9969664. Throughput: 0: 282.4. Samples: 2491938. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:30,485][00034] Avg episode reward: [(0, '-2.230')] [2024-08-05 08:09:35,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9969664. Throughput: 0: 282.2. Samples: 2493615. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:35,485][00034] Avg episode reward: [(0, '-2.230')] [2024-08-05 08:09:35,495][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001217_9969664.pth... 
[2024-08-05 08:09:35,572][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001185_9707520.pth [2024-08-05 08:09:38,391][00139] DAMAGECOUNT value on done: 132512.0 [2024-08-05 08:09:38,391][00139] Sum rewards: -1.921, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-1.238', 'weapon4': '0.002', 'AMMO2': '0.012', 'WEAPON4': '0.050', 'AMMO4': '0.057', 'ARMOR': '0.064', 'AMMO3': '0.165', 'HITCOUNT': '0.340', 'WEAPON3': '0.950', 'DAMAGECOUNT': '1.161', 'weapon2': '1.422', 'weapon3': '2.344', 'FRAGCOUNT': '4.000'} [2024-08-05 08:09:38,629][00139] DAMAGECOUNT value on done: 142538.0 [2024-08-05 08:09:38,629][00139] Sum rewards: -3.899, reward structure: {'DEATHCOUNT': '-12.750', 'HEALTH': '-0.970', 'AMMO5': '0.011', 'ARMOR': '0.024', 'AMMO2': '0.030', 'weapon5': '0.114', 'AMMO4': '0.147', 'WEAPON4': '0.150', 'weapon4': '0.176', 'AMMO3': '0.178', 'HITCOUNT': '0.220', 'WEAPON5': '0.250', 'WEAPON3': '1.000', 'DAMAGECOUNT': '1.203', 'weapon2': '1.250', 'weapon3': '2.068', 'FRAGCOUNT': '3.000'} [2024-08-05 08:09:40,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9977856. Throughput: 0: 280.3. Samples: 2495215. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:40,485][00034] Avg episode reward: [(0, '-2.262')] [2024-08-05 08:09:45,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 9986048. Throughput: 0: 280.3. Samples: 2496087. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:45,485][00034] Avg episode reward: [(0, '-2.262')] [2024-08-05 08:09:50,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9986048. Throughput: 0: 281.2. Samples: 2497771. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:50,484][00034] Avg episode reward: [(0, '-2.262')] [2024-08-05 08:09:51,830][00138] Updated weights for policy 0, policy_version 1220 (0.0017) [2024-08-05 08:09:53,150][00139] DAMAGECOUNT value on done: 132695.0 [2024-08-05 08:09:53,150][00139] Sum rewards: -0.657, reward structure: {'DEATHCOUNT': '-7.500', 'HEALTH': '-1.336', 'AMMO4': '-0.016', 'AMMO2': '-0.003', 'WEAPON1': '0.010', 'ARMOR': '0.102', 'AMMO3': '0.143', 'HITCOUNT': '0.200', 'DAMAGECOUNT': '0.549', 'WEAPON3': '0.750', 'weapon2': '1.654', 'weapon3': '1.790', 'FRAGCOUNT': '3.000'} [2024-08-05 08:09:53,390][00139] DAMAGECOUNT value on done: 142698.0 [2024-08-05 08:09:53,391][00139] Sum rewards: -3.521, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-1.326', 'AMMO2': '0.002', 'AMMO5': '0.010', 'weapon5': '0.012', 'AMMO4': '0.012', 'ARMOR': '0.032', 'WEAPON4': '0.050', 'weapon7': '0.060', 'AMMO3': '0.116', 'AMMO6': '0.120', 'AMMO7': '0.120', 'weapon4': '0.140', 'HITCOUNT': '0.150', 'WEAPON5': '0.200', 'WEAPON7': '0.200', 'DAMAGECOUNT': '0.480', 'WEAPON3': '0.700', 'FRAGCOUNT': '1.000', 'weapon3': '1.476', 'weapon2': '1.924'} [2024-08-05 08:09:55,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 9994240. Throughput: 0: 282.4. Samples: 2499504. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:09:55,484][00034] Avg episode reward: [(0, '-2.250')] [2024-08-05 08:10:00,483][00034] Fps is (10 sec: 1638.4, 60 sec: 1228.8, 300 sec: 1138.5). Total num frames: 10002432. Throughput: 0: 282.1. Samples: 2500370. Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:10:00,484][00034] Avg episode reward: [(0, '-2.250')] [2024-08-05 08:10:05,483][00034] Fps is (10 sec: 819.2, 60 sec: 1092.3, 300 sec: 1110.8). Total num frames: 10002432. Throughput: 0: 280.2. Samples: 2502003. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 1.0) [2024-08-05 08:10:05,485][00034] Avg episode reward: [(0, '-2.250')] [2024-08-05 08:10:06,544][00132] Stopping Batcher_0... [2024-08-05 08:10:06,544][00132] Loop batcher_evt_loop terminating... [2024-08-05 08:10:06,545][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_10010624.pth... [2024-08-05 08:10:06,553][00139] Stopping RolloutWorker_w0... [2024-08-05 08:10:06,554][00139] Loop rollout_proc0_evt_loop terminating... [2024-08-05 08:10:06,551][00034] Component Batcher_0 stopped! [2024-08-05 08:10:06,556][00034] Component RolloutWorker_w0 stopped! [2024-08-05 08:10:06,571][00138] Weights refcount: 2 0 [2024-08-05 08:10:06,573][00138] Stopping InferenceWorker_p0-w0... [2024-08-05 08:10:06,573][00138] Loop inference_proc0-0_evt_loop terminating... [2024-08-05 08:10:06,573][00034] Component InferenceWorker_p0-w0 stopped! [2024-08-05 08:10:06,620][00132] Removing /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001201_9838592.pth [2024-08-05 08:10:06,628][00132] Saving /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_10010624.pth... [2024-08-05 08:10:06,732][00132] Stopping LearnerWorker_p0... [2024-08-05 08:10:06,733][00132] Loop learner_proc0_evt_loop terminating... [2024-08-05 08:10:06,733][00034] Component LearnerWorker_p0 stopped! [2024-08-05 08:10:06,735][00034] Waiting for process learner_proc0 to stop... [2024-08-05 08:10:07,534][00034] Waiting for process inference_proc0-0 to join... [2024-08-05 08:10:07,537][00034] Waiting for process rollout_proc0 to join... 
[2024-08-05 08:10:07,539][00034] Batcher 0 profile tree view:
batching: 26.6902, releasing_batches: 0.0368
[2024-08-05 08:10:07,540][00034] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0000
wait_policy_total: 61.1040
update_model: 59.5236
weight_update: 0.0018
one_step: 0.0056
handle_policy_step: 8337.4966
deserialize: 112.1406, stack: 35.0570, obs_to_device_normalize: 1542.7320, forward: 5718.4948, send_messages: 136.3395
prepare_outputs: 516.8943
to_cpu: 287.3118
[2024-08-05 08:10:07,541][00034] Learner 0 profile tree view:
misc: 0.0069, prepare_batch: 13.5349
train: 73.2858
epoch_init: 0.0071, minibatch_init: 0.0068, losses_postprocess: 0.3579, kl_divergence: 1.0528, after_optimizer: 39.3503
calculate_losses: 17.9133
losses_init: 0.0046, forward_head: 1.0948, bptt_initial: 11.0976, tail: 1.5028, advantages_returns: 0.1520, losses: 2.4529
bptt: 1.4099
bptt_forward_core: 1.3532
update: 14.1182
clip: 0.8819
[2024-08-05 08:10:07,542][00034] RolloutWorker_w0 profile tree view:
wait_for_trajectories: 4.3697, enqueue_policy_requests: 193.8860, env_step: 4870.6171, overhead: 108.2441, complete_rollouts: 8.6614
save_policy_outputs: 221.7255
split_output_tensors: 84.0771
[2024-08-05 08:10:07,544][00034] Loop Runner_EvtLoop terminating...
[2024-08-05 08:10:07,545][00034] Runner profile tree view:
main_loop: 8786.7048
[2024-08-05 08:10:07,546][00034] Collected {0: 10010624}, FPS: 1139.3
[2024-08-05 08:11:34,430][00034] Loading existing experiment configuration from /kaggle/working/train_dir/default_experiment/config.json
[2024-08-05 08:11:34,432][00034] Adding new argument 'no_render'=True that is not in the saved config file!
[2024-08-05 08:11:34,433][00034] Adding new argument 'save_video'=True that is not in the saved config file!
[2024-08-05 08:11:34,434][00034] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2024-08-05 08:11:34,435][00034] Adding new argument 'video_name'=None that is not in the saved config file!
[2024-08-05 08:11:34,435][00034] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! [2024-08-05 08:11:34,436][00034] Adding new argument 'max_num_episodes'=4 that is not in the saved config file! [2024-08-05 08:11:34,437][00034] Adding new argument 'push_to_hub'=False that is not in the saved config file! [2024-08-05 08:11:34,438][00034] Adding new argument 'hf_repository'=None that is not in the saved config file! [2024-08-05 08:11:34,439][00034] Adding new argument 'policy_index'=0 that is not in the saved config file! [2024-08-05 08:11:34,440][00034] Adding new argument 'eval_deterministic'=False that is not in the saved config file! [2024-08-05 08:11:34,441][00034] Adding new argument 'train_script'=None that is not in the saved config file! [2024-08-05 08:11:34,442][00034] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2024-08-05 08:11:34,443][00034] Using frameskip 1 and render_action_repeat=4 for evaluation [2024-08-05 08:11:34,475][00034] Doom resolution: 160x120, resize resolution: (128, 72) [2024-08-05 08:11:34,479][00034] Port 40300 is available [2024-08-05 08:11:34,480][00034] Using port 40300 [2024-08-05 08:11:34,482][00034] RunningMeanStd input shape: (23,) [2024-08-05 08:11:34,484][00034] RunningMeanStd input shape: (3, 72, 128) [2024-08-05 08:11:34,486][00034] RunningMeanStd input shape: (1,) [2024-08-05 08:11:34,505][00034] ConvEncoder: input_channels=3 [2024-08-05 08:11:34,658][00034] Conv encoder output size: 512 [2024-08-05 08:11:34,660][00034] Policy head output size: 640 [2024-08-05 08:11:34,863][00034] Loading state from checkpoint /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_10010624.pth... [2024-08-05 08:11:34,907][00034] Using port 40300 on host... [2024-08-05 08:11:35,232][00034] Initialized w:0 v:0 player:0 [2024-08-05 08:11:35,762][00034] Num frames 100... [2024-08-05 08:11:35,986][00034] Num frames 200... 
[2024-08-05 08:11:36,212][00034] Num frames 300... [2024-08-05 08:11:36,432][00034] Num frames 400... [2024-08-05 08:11:36,655][00034] Num frames 500... [2024-08-05 08:11:36,881][00034] Num frames 600... [2024-08-05 08:11:37,109][00034] Num frames 700... [2024-08-05 08:11:37,336][00034] Num frames 800... [2024-08-05 08:11:37,562][00034] Num frames 900... [2024-08-05 08:11:37,791][00034] Num frames 1000... [2024-08-05 08:11:38,018][00034] Num frames 1100... [2024-08-05 08:11:38,250][00034] Num frames 1200... [2024-08-05 08:11:38,479][00034] Num frames 1300... [2024-08-05 08:11:38,713][00034] Num frames 1400... [2024-08-05 08:11:38,964][00034] Num frames 1500... [2024-08-05 08:11:39,187][00034] Num frames 1600... [2024-08-05 08:11:39,412][00034] Num frames 1700... [2024-08-05 08:11:39,635][00034] Num frames 1800... [2024-08-05 08:11:39,865][00034] Num frames 1900... [2024-08-05 08:11:40,087][00034] Num frames 2000... [2024-08-05 08:11:40,315][00034] Num frames 2100... [2024-08-05 08:11:40,541][00034] Num frames 2200... [2024-08-05 08:11:40,774][00034] Num frames 2300... [2024-08-05 08:11:40,998][00034] Num frames 2400... [2024-08-05 08:11:41,224][00034] Num frames 2500... [2024-08-05 08:11:41,480][00034] Num frames 2600... [2024-08-05 08:11:41,759][00034] Num frames 2700... [2024-08-05 08:11:42,001][00034] Num frames 2800... [2024-08-05 08:11:42,240][00034] Num frames 2900... [2024-08-05 08:11:42,503][00034] Num frames 3000... [2024-08-05 08:11:42,729][00034] Num frames 3100... [2024-08-05 08:11:42,955][00034] Num frames 3200... [2024-08-05 08:11:43,173][00034] Num frames 3300... [2024-08-05 08:11:43,391][00034] Num frames 3400... [2024-08-05 08:11:43,612][00034] Num frames 3500... [2024-08-05 08:11:43,828][00034] Num frames 3600... [2024-08-05 08:11:44,048][00034] Num frames 3700... [2024-08-05 08:11:44,262][00034] Num frames 3800... [2024-08-05 08:11:44,479][00034] Num frames 3900... [2024-08-05 08:11:44,701][00034] Num frames 4000... 
[2024-08-05 08:11:44,928][00034] Num frames 4100... [2024-08-05 08:11:45,147][00034] Num frames 4200... [2024-08-05 08:11:45,370][00034] Num frames 4300... [2024-08-05 08:11:45,587][00034] Num frames 4400... [2024-08-05 08:11:45,812][00034] Num frames 4500... [2024-08-05 08:11:46,038][00034] Num frames 4600... [2024-08-05 08:11:46,262][00034] Num frames 4700... [2024-08-05 08:11:46,489][00034] Num frames 4800... [2024-08-05 08:11:46,723][00034] Num frames 4900... [2024-08-05 08:11:46,944][00034] Num frames 5000... [2024-08-05 08:11:47,167][00034] Num frames 5100... [2024-08-05 08:11:47,395][00034] Num frames 5200... [2024-08-05 08:11:47,618][00034] Num frames 5300... [2024-08-05 08:11:47,839][00034] Num frames 5400... [2024-08-05 08:11:48,075][00034] Num frames 5500... [2024-08-05 08:11:48,310][00034] Num frames 5600... [2024-08-05 08:11:48,534][00034] Num frames 5700... [2024-08-05 08:11:48,761][00034] Num frames 5800... [2024-08-05 08:11:49,003][00034] Num frames 5900... [2024-08-05 08:11:49,223][00034] Num frames 6000... [2024-08-05 08:11:49,444][00034] Num frames 6100... [2024-08-05 08:11:49,667][00034] Num frames 6200... [2024-08-05 08:11:49,897][00034] Num frames 6300... [2024-08-05 08:11:50,127][00034] Num frames 6400... [2024-08-05 08:11:50,355][00034] Num frames 6500... [2024-08-05 08:11:50,577][00034] Num frames 6600... [2024-08-05 08:11:50,803][00034] Num frames 6700... [2024-08-05 08:11:51,037][00034] Num frames 6800... [2024-08-05 08:11:51,255][00034] Num frames 6900... [2024-08-05 08:11:51,484][00034] Num frames 7000... [2024-08-05 08:11:51,724][00034] Num frames 7100... [2024-08-05 08:11:51,944][00034] Num frames 7200... [2024-08-05 08:11:52,174][00034] Num frames 7300... [2024-08-05 08:11:52,415][00034] Num frames 7400... [2024-08-05 08:11:52,633][00034] Num frames 7500... [2024-08-05 08:11:52,858][00034] Num frames 7600... [2024-08-05 08:11:53,090][00034] Num frames 7700... [2024-08-05 08:11:53,315][00034] Num frames 7800... 
[2024-08-05 08:11:53,541][00034] Num frames 7900... [2024-08-05 08:11:53,775][00034] Num frames 8000... [2024-08-05 08:11:54,005][00034] Num frames 8100... [2024-08-05 08:11:54,240][00034] Num frames 8200... [2024-08-05 08:11:54,471][00034] Num frames 8300... [2024-08-05 08:11:54,695][00034] DAMAGECOUNT value on done: 133.0 [2024-08-05 08:11:54,696][00034] Sum rewards: 5.257, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-3.755', 'AMMO2': '0.001', 'AMMO4': '0.007', 'AMMO5': '0.010', 'WEAPON1': '0.020', 'ARMOR': '0.048', 'weapon5': '0.072', 'HITCOUNT': '0.090', 'WEAPON5': '0.100', 'AMMO3': '0.173', 'WEAPON4': '0.200', 'weapon4': '0.236', 'DAMAGECOUNT': '0.399', 'WEAPON3': '1.100', 'FRAGCOUNT': '2.000', 'weapon2': '6.264', 'weapon3': '7.292'} [2024-08-05 08:11:54,759][00034] Avg episode rewards: #0: 5.257, true rewards: #0: 2.000 [2024-08-05 08:11:54,763][00034] Avg episode reward: 5.257, avg true_objective: 2.000 [2024-08-05 08:11:54,769][00034] Num frames 8400... [2024-08-05 08:11:55,003][00034] Num frames 8500... [2024-08-05 08:11:55,244][00034] Num frames 8600... [2024-08-05 08:11:55,484][00034] Num frames 8700... [2024-08-05 08:11:55,714][00034] Num frames 8800... [2024-08-05 08:11:55,944][00034] Num frames 8900... [2024-08-05 08:11:56,172][00034] Num frames 9000... [2024-08-05 08:11:56,393][00034] Num frames 9100... [2024-08-05 08:11:56,624][00034] Num frames 9200... [2024-08-05 08:11:56,855][00034] Num frames 9300... [2024-08-05 08:11:57,078][00034] Num frames 9400... [2024-08-05 08:11:57,301][00034] Num frames 9500... [2024-08-05 08:11:57,527][00034] Num frames 9600... [2024-08-05 08:11:57,753][00034] Num frames 9700... [2024-08-05 08:11:57,975][00034] Num frames 9800... [2024-08-05 08:11:58,199][00034] Num frames 9900... [2024-08-05 08:11:58,426][00034] Num frames 10000... [2024-08-05 08:11:58,651][00034] Num frames 10100... [2024-08-05 08:11:58,902][00034] Num frames 10200... [2024-08-05 08:11:59,124][00034] Num frames 10300... 
[2024-08-05 08:11:59,344][00034] Num frames 10400... [2024-08-05 08:11:59,565][00034] Num frames 10500... [2024-08-05 08:11:59,788][00034] Num frames 10600... [2024-08-05 08:12:00,011][00034] Num frames 10700... [2024-08-05 08:12:00,236][00034] Num frames 10800... [2024-08-05 08:12:00,466][00034] Num frames 10900... [2024-08-05 08:12:00,690][00034] Num frames 11000... [2024-08-05 08:12:00,913][00034] Num frames 11100... [2024-08-05 08:12:01,136][00034] Num frames 11200... [2024-08-05 08:12:01,358][00034] Num frames 11300... [2024-08-05 08:12:01,578][00034] Num frames 11400... [2024-08-05 08:12:01,811][00034] Num frames 11500... [2024-08-05 08:12:02,041][00034] Num frames 11600... [2024-08-05 08:12:02,275][00034] Num frames 11700... [2024-08-05 08:12:02,508][00034] Num frames 11800... [2024-08-05 08:12:02,736][00034] Num frames 11900... [2024-08-05 08:12:02,962][00034] Num frames 12000... [2024-08-05 08:12:03,188][00034] Num frames 12100... [2024-08-05 08:12:03,417][00034] Num frames 12200... [2024-08-05 08:12:03,642][00034] Num frames 12300... [2024-08-05 08:12:03,869][00034] Num frames 12400... [2024-08-05 08:12:04,091][00034] Num frames 12500... [2024-08-05 08:12:04,311][00034] Num frames 12600... [2024-08-05 08:12:04,532][00034] Num frames 12700... [2024-08-05 08:12:04,751][00034] Num frames 12800... [2024-08-05 08:12:04,972][00034] Num frames 12900... [2024-08-05 08:12:05,198][00034] Num frames 13000... [2024-08-05 08:12:05,421][00034] Num frames 13100... [2024-08-05 08:12:05,647][00034] Num frames 13200... [2024-08-05 08:12:05,874][00034] Num frames 13300... [2024-08-05 08:12:06,105][00034] Num frames 13400... [2024-08-05 08:12:06,333][00034] Num frames 13500... [2024-08-05 08:12:06,559][00034] Num frames 13600... [2024-08-05 08:12:06,788][00034] Num frames 13700... [2024-08-05 08:12:07,019][00034] Num frames 13800... [2024-08-05 08:12:07,266][00034] Num frames 13900... [2024-08-05 08:12:07,495][00034] Num frames 14000... 
[2024-08-05 08:12:07,726][00034] Num frames 14100... [2024-08-05 08:12:07,962][00034] Num frames 14200... [2024-08-05 08:12:08,209][00034] Num frames 14300... [2024-08-05 08:12:08,446][00034] Num frames 14400... [2024-08-05 08:12:08,686][00034] Num frames 14500... [2024-08-05 08:12:08,953][00034] Num frames 14600... [2024-08-05 08:12:09,193][00034] Num frames 14700... [2024-08-05 08:12:09,450][00034] Num frames 14800... [2024-08-05 08:12:09,689][00034] Num frames 14900... [2024-08-05 08:12:09,929][00034] Num frames 15000... [2024-08-05 08:12:10,160][00034] Num frames 15100... [2024-08-05 08:12:10,385][00034] Num frames 15200... [2024-08-05 08:12:10,611][00034] Num frames 15300... [2024-08-05 08:12:10,845][00034] Num frames 15400... [2024-08-05 08:12:11,070][00034] Num frames 15500... [2024-08-05 08:12:11,296][00034] Num frames 15600... [2024-08-05 08:12:11,523][00034] Num frames 15700... [2024-08-05 08:12:11,751][00034] Num frames 15800... [2024-08-05 08:12:11,977][00034] Num frames 15900... [2024-08-05 08:12:12,207][00034] Num frames 16000... [2024-08-05 08:12:12,430][00034] Num frames 16100... [2024-08-05 08:12:12,652][00034] Num frames 16200... [2024-08-05 08:12:12,900][00034] Num frames 16300... [2024-08-05 08:12:13,183][00034] Num frames 16400... [2024-08-05 08:12:13,442][00034] Num frames 16500... [2024-08-05 08:12:13,674][00034] Num frames 16600... [2024-08-05 08:12:13,927][00034] Num frames 16700... [2024-08-05 08:12:14,152][00034] DAMAGECOUNT value on done: 487.0 [2024-08-05 08:12:14,215][00034] Avg episode rewards: #0: 6.359, true rewards: #0: 1.000 [2024-08-05 08:12:14,217][00034] Avg episode reward: 6.359, avg true_objective: 1.000 [2024-08-05 08:12:14,224][00034] Num frames 16800... [2024-08-05 08:12:14,461][00034] Num frames 16900... [2024-08-05 08:12:14,694][00034] Num frames 17000... [2024-08-05 08:12:14,925][00034] Num frames 17100... [2024-08-05 08:12:15,151][00034] Num frames 17200... [2024-08-05 08:12:15,370][00034] Num frames 17300... 
[2024-08-05 08:12:15,595][00034] Num frames 17400... [2024-08-05 08:12:15,813][00034] Num frames 17500... [2024-08-05 08:12:16,046][00034] Num frames 17600... [2024-08-05 08:12:16,268][00034] Num frames 17700... [2024-08-05 08:12:16,494][00034] Num frames 17800... [2024-08-05 08:12:16,719][00034] Num frames 17900... [2024-08-05 08:12:16,944][00034] Num frames 18000... [2024-08-05 08:12:17,186][00034] Num frames 18100... [2024-08-05 08:12:17,425][00034] Num frames 18200... [2024-08-05 08:12:17,656][00034] Num frames 18300... [2024-08-05 08:12:17,894][00034] Num frames 18400... [2024-08-05 08:12:18,137][00034] Num frames 18500... [2024-08-05 08:12:18,378][00034] Num frames 18600... [2024-08-05 08:12:18,608][00034] Num frames 18700... [2024-08-05 08:12:18,857][00034] Num frames 18800... [2024-08-05 08:12:19,087][00034] Num frames 18900... [2024-08-05 08:12:19,308][00034] Num frames 19000... [2024-08-05 08:12:19,533][00034] Num frames 19100... [2024-08-05 08:12:19,779][00034] Num frames 19200... [2024-08-05 08:12:20,008][00034] Num frames 19300... [2024-08-05 08:12:20,229][00034] Num frames 19400... [2024-08-05 08:12:20,461][00034] Num frames 19500... [2024-08-05 08:12:20,683][00034] Num frames 19600... [2024-08-05 08:12:20,917][00034] Num frames 19700... [2024-08-05 08:12:21,147][00034] Num frames 19800... [2024-08-05 08:12:21,371][00034] Num frames 19900... [2024-08-05 08:12:21,607][00034] Num frames 20000... [2024-08-05 08:12:21,835][00034] Num frames 20100... [2024-08-05 08:12:22,065][00034] Num frames 20200... [2024-08-05 08:12:22,297][00034] Num frames 20300... [2024-08-05 08:12:22,535][00034] Num frames 20400... [2024-08-05 08:12:22,767][00034] Num frames 20500... [2024-08-05 08:12:22,995][00034] Num frames 20600... [2024-08-05 08:12:23,218][00034] Num frames 20700... [2024-08-05 08:12:23,439][00034] Num frames 20800... [2024-08-05 08:12:23,662][00034] Num frames 20900... [2024-08-05 08:12:23,885][00034] Num frames 21000... 
[2024-08-05 08:12:24,110][00034] Num frames 21100... [2024-08-05 08:12:24,331][00034] Num frames 21200... [2024-08-05 08:12:24,557][00034] Num frames 21300... [2024-08-05 08:12:24,787][00034] Num frames 21400... [2024-08-05 08:12:25,014][00034] Num frames 21500... [2024-08-05 08:12:25,243][00034] Num frames 21600... [2024-08-05 08:12:25,494][00034] Num frames 21700... [2024-08-05 08:12:25,740][00034] Num frames 21800... [2024-08-05 08:12:25,984][00034] Num frames 21900... [2024-08-05 08:12:26,217][00034] Num frames 22000... [2024-08-05 08:12:26,442][00034] Num frames 22100... [2024-08-05 08:12:26,677][00034] Num frames 22200... [2024-08-05 08:12:26,913][00034] Num frames 22300... [2024-08-05 08:12:27,146][00034] Num frames 22400... [2024-08-05 08:12:27,371][00034] Num frames 22500... [2024-08-05 08:12:27,598][00034] Num frames 22600... [2024-08-05 08:12:27,830][00034] Num frames 22700... [2024-08-05 08:12:28,059][00034] Num frames 22800... [2024-08-05 08:12:28,276][00034] Num frames 22900... [2024-08-05 08:12:28,494][00034] Num frames 23000... [2024-08-05 08:12:28,709][00034] Num frames 23100... [2024-08-05 08:12:28,950][00034] Num frames 23200... [2024-08-05 08:12:29,159][00034] Num frames 23300... [2024-08-05 08:12:29,373][00034] Num frames 23400... [2024-08-05 08:12:29,583][00034] Num frames 23500... [2024-08-05 08:12:29,792][00034] Num frames 23600... [2024-08-05 08:12:30,006][00034] Num frames 23700... [2024-08-05 08:12:30,217][00034] Num frames 23800... [2024-08-05 08:12:30,429][00034] Num frames 23900... [2024-08-05 08:12:30,643][00034] Num frames 24000... [2024-08-05 08:12:30,850][00034] Num frames 24100... [2024-08-05 08:12:31,062][00034] Num frames 24200... [2024-08-05 08:12:31,273][00034] Num frames 24300... [2024-08-05 08:12:31,486][00034] Num frames 24400... [2024-08-05 08:12:31,697][00034] Num frames 24500... [2024-08-05 08:12:31,909][00034] Num frames 24600... [2024-08-05 08:12:32,118][00034] Num frames 24700... 
[2024-08-05 08:12:32,328][00034] Num frames 24800... [2024-08-05 08:12:32,538][00034] Num frames 24900... [2024-08-05 08:12:32,774][00034] Num frames 25000... [2024-08-05 08:12:32,998][00034] Num frames 25100... [2024-08-05 08:12:33,211][00034] DAMAGECOUNT value on done: 732.0 [2024-08-05 08:12:33,212][00034] Sum rewards: 7.003, reward structure: {'DEATHCOUNT': '-9.750', 'HEALTH': '-3.820', 'AMMO4': '-0.027', 'AMMO2': '-0.005', 'AMMO5': '0.005', 'weapon5': '0.024', 'ARMOR': '0.040', 'WEAPON5': '0.100', 'AMMO3': '0.157', 'WEAPON4': '0.200', 'HITCOUNT': '0.200', 'weapon7': '0.216', 'AMMO6': '0.320', 'AMMO7': '0.320', 'WEAPON7': '0.400', 'DAMAGECOUNT': '0.735', 'weapon4': '0.960', 'FRAGCOUNT': '1.000', 'WEAPON3': '1.100', 'weapon2': '6.052', 'weapon3': '8.776'} [2024-08-05 08:12:33,275][00034] Avg episode rewards: #0: 6.573, true rewards: #0: 1.000 [2024-08-05 08:12:33,275][00034] Avg episode reward: 6.573, avg true_objective: 1.000 [2024-08-05 08:12:33,284][00034] Num frames 25200... [2024-08-05 08:12:33,496][00034] Num frames 25300... [2024-08-05 08:12:33,715][00034] Num frames 25400... [2024-08-05 08:12:33,934][00034] Num frames 25500... [2024-08-05 08:12:34,151][00034] Num frames 25600... [2024-08-05 08:12:34,374][00034] Num frames 25700... [2024-08-05 08:12:34,594][00034] Num frames 25800... [2024-08-05 08:12:34,823][00034] Num frames 25900... [2024-08-05 08:12:35,054][00034] Num frames 26000... [2024-08-05 08:12:35,280][00034] Num frames 26100... [2024-08-05 08:12:35,494][00034] Num frames 26200... [2024-08-05 08:12:35,711][00034] Num frames 26300... [2024-08-05 08:12:35,936][00034] Num frames 26400... [2024-08-05 08:12:36,147][00034] Num frames 26500... [2024-08-05 08:12:36,359][00034] Num frames 26600... [2024-08-05 08:12:36,574][00034] Num frames 26700... [2024-08-05 08:12:36,797][00034] Num frames 26800... [2024-08-05 08:12:37,019][00034] Num frames 26900... [2024-08-05 08:12:37,239][00034] Num frames 27000... 
[2024-08-05 08:12:37,461][00034] Num frames 27100... [2024-08-05 08:12:37,682][00034] Num frames 27200... [2024-08-05 08:12:37,911][00034] Num frames 27300... [2024-08-05 08:12:38,127][00034] Num frames 27400... [2024-08-05 08:12:38,342][00034] Num frames 27500... [2024-08-05 08:12:38,560][00034] Num frames 27600... [2024-08-05 08:12:38,797][00034] Num frames 27700... [2024-08-05 08:12:39,035][00034] Num frames 27800... [2024-08-05 08:12:39,248][00034] Num frames 27900... [2024-08-05 08:12:39,467][00034] Num frames 28000... [2024-08-05 08:12:39,681][00034] Num frames 28100... [2024-08-05 08:12:39,904][00034] Num frames 28200... [2024-08-05 08:12:40,132][00034] Num frames 28300... [2024-08-05 08:12:40,348][00034] Num frames 28400... [2024-08-05 08:12:40,568][00034] Num frames 28500... [2024-08-05 08:12:40,790][00034] Num frames 28600... [2024-08-05 08:12:41,006][00034] Num frames 28700... [2024-08-05 08:12:41,218][00034] Num frames 28800... [2024-08-05 08:12:41,429][00034] Num frames 28900... [2024-08-05 08:12:41,642][00034] Num frames 29000... [2024-08-05 08:12:41,855][00034] Num frames 29100... [2024-08-05 08:12:42,074][00034] Num frames 29200... [2024-08-05 08:12:42,285][00034] Num frames 29300... [2024-08-05 08:12:42,505][00034] Num frames 29400... [2024-08-05 08:12:42,723][00034] Num frames 29500... [2024-08-05 08:12:42,941][00034] Num frames 29600... [2024-08-05 08:12:43,157][00034] Num frames 29700... [2024-08-05 08:12:43,372][00034] Num frames 29800... [2024-08-05 08:12:43,593][00034] Num frames 29900... [2024-08-05 08:12:43,817][00034] Num frames 30000... [2024-08-05 08:12:44,046][00034] Num frames 30100... [2024-08-05 08:12:44,266][00034] Num frames 30200... [2024-08-05 08:12:44,535][00034] Num frames 30300... [2024-08-05 08:12:44,781][00034] Num frames 30400... [2024-08-05 08:12:45,023][00034] Num frames 30500... [2024-08-05 08:12:45,276][00034] Num frames 30600... [2024-08-05 08:12:45,510][00034] Num frames 30700... 
[2024-08-05 08:12:45,732][00034] Num frames 30800... [2024-08-05 08:12:45,951][00034] Num frames 30900... [2024-08-05 08:12:46,176][00034] Num frames 31000... [2024-08-05 08:12:46,397][00034] Num frames 31100... [2024-08-05 08:12:46,622][00034] Num frames 31200... [2024-08-05 08:12:46,836][00034] Num frames 31300... [2024-08-05 08:12:47,055][00034] Num frames 31400... [2024-08-05 08:12:47,268][00034] Num frames 31500... [2024-08-05 08:12:47,491][00034] Num frames 31600... [2024-08-05 08:12:47,706][00034] Num frames 31700... [2024-08-05 08:12:47,921][00034] Num frames 31800... [2024-08-05 08:12:48,142][00034] Num frames 31900... [2024-08-05 08:12:48,360][00034] Num frames 32000... [2024-08-05 08:12:48,581][00034] Num frames 32100... [2024-08-05 08:12:48,805][00034] Num frames 32200... [2024-08-05 08:12:49,037][00034] Num frames 32300... [2024-08-05 08:12:49,252][00034] Num frames 32400... [2024-08-05 08:12:49,480][00034] Num frames 32500... [2024-08-05 08:12:49,693][00034] Num frames 32600... [2024-08-05 08:12:49,911][00034] Num frames 32700... [2024-08-05 08:12:50,133][00034] Num frames 32800... [2024-08-05 08:12:50,350][00034] Num frames 32900... [2024-08-05 08:12:50,569][00034] Num frames 33000... [2024-08-05 08:12:50,794][00034] Num frames 33100... [2024-08-05 08:12:51,033][00034] Num frames 33200... [2024-08-05 08:12:51,259][00034] Num frames 33300... [2024-08-05 08:12:51,488][00034] Num frames 33400... [2024-08-05 08:12:51,715][00034] Num frames 33500... 
[2024-08-05 08:12:51,931][00034] DAMAGECOUNT value on done: 1308.0 [2024-08-05 08:12:51,932][00034] Sum rewards: 8.838, reward structure: {'DEATHCOUNT': '-9.000', 'HEALTH': '-3.540', 'AMMO2': '0.013', 'AMMO5': '0.021', 'AMMO4': '0.065', 'AMMO3': '0.139', 'HITCOUNT': '0.240', 'WEAPON5': '0.400', 'weapon5': '0.930', 'WEAPON3': '1.100', 'FRAGCOUNT': '1.500', 'DAMAGECOUNT': '1.728', 'weapon2': '6.252', 'weapon3': '8.990'} [2024-08-05 08:12:51,994][00034] Avg episode rewards: #0: 7.140, true rewards: #0: 1.250 [2024-08-05 08:12:51,995][00034] Avg episode reward: 7.140, avg true_objective: 1.250 [2024-08-05 08:14:33,411][00034] Replay video saved to /kaggle/working/train_dir/default_experiment/replay.mp4! [2024-08-05 08:15:19,621][00034] Loading existing experiment configuration from /kaggle/working/train_dir/default_experiment/config.json [2024-08-05 08:15:19,623][00034] Adding new argument 'no_render'=True that is not in the saved config file! [2024-08-05 08:15:19,623][00034] Adding new argument 'save_video'=True that is not in the saved config file! [2024-08-05 08:15:19,624][00034] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! [2024-08-05 08:15:19,626][00034] Adding new argument 'video_name'=None that is not in the saved config file! [2024-08-05 08:15:19,626][00034] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! [2024-08-05 08:15:19,627][00034] Adding new argument 'max_num_episodes'=4 that is not in the saved config file! [2024-08-05 08:15:19,628][00034] Adding new argument 'push_to_hub'=True that is not in the saved config file! [2024-08-05 08:15:19,629][00034] Adding new argument 'hf_repository'='Mojitrk/deathmatch-1-2' that is not in the saved config file! [2024-08-05 08:15:19,630][00034] Adding new argument 'policy_index'=0 that is not in the saved config file! [2024-08-05 08:15:19,631][00034] Adding new argument 'eval_deterministic'=False that is not in the saved config file! 
[2024-08-05 08:15:19,632][00034] Adding new argument 'train_script'=None that is not in the saved config file! [2024-08-05 08:15:19,633][00034] Adding new argument 'enjoy_script'=None that is not in the saved config file! [2024-08-05 08:15:19,634][00034] Using frameskip 1 and render_action_repeat=4 for evaluation [2024-08-05 08:15:19,664][00034] Port 40300 is available [2024-08-05 08:15:19,665][00034] Using port 40300 [2024-08-05 08:15:19,668][00034] RunningMeanStd input shape: (23,) [2024-08-05 08:15:19,669][00034] RunningMeanStd input shape: (3, 72, 128) [2024-08-05 08:15:19,670][00034] RunningMeanStd input shape: (1,) [2024-08-05 08:15:19,686][00034] ConvEncoder: input_channels=3 [2024-08-05 08:15:19,736][00034] Conv encoder output size: 512 [2024-08-05 08:15:19,738][00034] Policy head output size: 640 [2024-08-05 08:15:19,768][00034] Loading state from checkpoint /kaggle/working/train_dir/default_experiment/checkpoint_p0/checkpoint_000001222_10010624.pth... [2024-08-05 08:15:19,808][00034] Using port 40300 on host... [2024-08-05 08:15:20,172][00034] Initialized w:0 v:0 player:0 [2024-08-05 08:15:20,399][00034] Num frames 100... [2024-08-05 08:15:20,616][00034] Num frames 200... [2024-08-05 08:15:20,833][00034] Num frames 300... [2024-08-05 08:15:21,051][00034] Num frames 400... [2024-08-05 08:15:21,267][00034] Num frames 500... [2024-08-05 08:15:21,480][00034] Num frames 600... [2024-08-05 08:15:21,692][00034] Num frames 700... [2024-08-05 08:15:21,909][00034] Num frames 800... [2024-08-05 08:15:22,128][00034] Num frames 900... [2024-08-05 08:15:22,345][00034] Num frames 1000... [2024-08-05 08:15:22,562][00034] Num frames 1100... [2024-08-05 08:15:22,775][00034] Num frames 1200... [2024-08-05 08:15:22,993][00034] Num frames 1300... [2024-08-05 08:15:23,208][00034] Num frames 1400... [2024-08-05 08:15:23,429][00034] Num frames 1500... [2024-08-05 08:15:23,648][00034] Num frames 1600... [2024-08-05 08:15:23,912][00034] Num frames 1700... 
[2024-08-05 08:15:24,148][00034] Num frames 1800... [2024-08-05 08:15:24,373][00034] Num frames 1900... [2024-08-05 08:15:24,606][00034] Num frames 2000... [2024-08-05 08:15:24,832][00034] Num frames 2100... [2024-08-05 08:15:25,050][00034] Num frames 2200... [2024-08-05 08:15:25,264][00034] Num frames 2300... [2024-08-05 08:15:25,476][00034] Num frames 2400... [2024-08-05 08:15:25,699][00034] Num frames 2500... [2024-08-05 08:15:25,913][00034] Num frames 2600... [2024-08-05 08:15:26,133][00034] Num frames 2700... [2024-08-05 08:15:26,348][00034] Num frames 2800... [2024-08-05 08:15:26,572][00034] Num frames 2900... [2024-08-05 08:15:26,786][00034] Num frames 3000... [2024-08-05 08:15:26,999][00034] Num frames 3100... [2024-08-05 08:15:27,213][00034] Num frames 3200... [2024-08-05 08:15:27,428][00034] Num frames 3300... [2024-08-05 08:15:27,644][00034] Num frames 3400... [2024-08-05 08:15:27,855][00034] Num frames 3500... [2024-08-05 08:15:28,077][00034] Num frames 3600... [2024-08-05 08:15:28,318][00034] Num frames 3700... [2024-08-05 08:15:28,527][00034] Num frames 3800... [2024-08-05 08:15:28,753][00034] Num frames 3900... [2024-08-05 08:15:28,997][00034] Num frames 4000... [2024-08-05 08:15:29,214][00034] Num frames 4100... [2024-08-05 08:15:29,437][00034] Num frames 4200... [2024-08-05 08:15:29,656][00034] Num frames 4300... [2024-08-05 08:15:29,874][00034] Num frames 4400... [2024-08-05 08:15:30,096][00034] Num frames 4500... [2024-08-05 08:15:30,311][00034] Num frames 4600... [2024-08-05 08:15:30,530][00034] Num frames 4700... [2024-08-05 08:15:30,748][00034] Num frames 4800... [2024-08-05 08:15:30,969][00034] Num frames 4900... [2024-08-05 08:15:31,182][00034] Num frames 5000... [2024-08-05 08:15:31,397][00034] Num frames 5100... [2024-08-05 08:15:31,620][00034] Num frames 5200... [2024-08-05 08:15:31,836][00034] Num frames 5300... [2024-08-05 08:15:32,058][00034] Num frames 5400... [2024-08-05 08:15:32,280][00034] Num frames 5500... 
[2024-08-05 08:15:32,501][00034] Num frames 5600... [2024-08-05 08:15:32,725][00034] Num frames 5700... [2024-08-05 08:15:32,941][00034] Num frames 5800... [2024-08-05 08:15:33,160][00034] Num frames 5900... [2024-08-05 08:15:33,381][00034] Num frames 6000... [2024-08-05 08:15:33,603][00034] Num frames 6100... [2024-08-05 08:15:33,823][00034] Num frames 6200... [2024-08-05 08:15:34,042][00034] Num frames 6300... [2024-08-05 08:15:34,255][00034] Num frames 6400... [2024-08-05 08:15:34,477][00034] Num frames 6500... [2024-08-05 08:15:34,701][00034] Num frames 6600... [2024-08-05 08:15:34,915][00034] Num frames 6700... [2024-08-05 08:15:35,129][00034] Num frames 6800... [2024-08-05 08:15:35,353][00034] Num frames 6900... [2024-08-05 08:15:35,570][00034] Num frames 7000... [2024-08-05 08:15:35,786][00034] Num frames 7100... [2024-08-05 08:15:36,003][00034] Num frames 7200... [2024-08-05 08:15:36,216][00034] Num frames 7300... [2024-08-05 08:15:36,433][00034] Num frames 7400... [2024-08-05 08:15:36,653][00034] Num frames 7500... [2024-08-05 08:15:36,870][00034] Num frames 7600... [2024-08-05 08:15:37,094][00034] Num frames 7700... [2024-08-05 08:15:37,315][00034] Num frames 7800... [2024-08-05 08:15:37,551][00034] Num frames 7900... [2024-08-05 08:15:37,788][00034] Num frames 8000... [2024-08-05 08:15:38,019][00034] Num frames 8100... [2024-08-05 08:15:38,241][00034] Num frames 8200... [2024-08-05 08:15:38,465][00034] Num frames 8300... 
[2024-08-05 08:15:38,690][00034] DAMAGECOUNT value on done: 348.0 [2024-08-05 08:15:38,692][00034] Sum rewards: 12.514, reward structure: {'DEATHCOUNT': '-6.000', 'HEALTH': '-2.564', 'AMMO4': '-0.050', 'AMMO2': '-0.010', 'AMMO5': '0.020', 'AMMO3': '0.114', 'WEAPON5': '0.200', 'WEAPON4': '0.200', 'HITCOUNT': '0.250', 'ARMOR': '0.531', 'weapon5': '0.594', 'WEAPON3': '0.800', 'DAMAGECOUNT': '1.044', 'weapon4': '2.012', 'FRAGCOUNT': '4.000', 'weapon2': '4.186', 'weapon3': '7.186'} [2024-08-05 08:15:38,757][00034] Avg episode rewards: #0: 12.513, true rewards: #0: 4.000 [2024-08-05 08:15:38,758][00034] Avg episode reward: 12.513, avg true_objective: 4.000 [2024-08-05 08:15:38,763][00034] Num frames 8400... [2024-08-05 08:15:39,011][00034] Num frames 8500... [2024-08-05 08:15:39,232][00034] Num frames 8600... [2024-08-05 08:15:39,450][00034] Num frames 8700... [2024-08-05 08:15:39,670][00034] Num frames 8800... [2024-08-05 08:15:39,892][00034] Num frames 8900... [2024-08-05 08:15:40,110][00034] Num frames 9000... [2024-08-05 08:15:40,328][00034] Num frames 9100... [2024-08-05 08:15:40,544][00034] Num frames 9200... [2024-08-05 08:15:40,756][00034] Num frames 9300... [2024-08-05 08:15:40,968][00034] Num frames 9400... [2024-08-05 08:15:41,184][00034] Num frames 9500... [2024-08-05 08:15:41,400][00034] Num frames 9600... [2024-08-05 08:15:41,620][00034] Num frames 9700... [2024-08-05 08:15:41,837][00034] Num frames 9800... [2024-08-05 08:15:42,055][00034] Num frames 9900... [2024-08-05 08:15:42,275][00034] Num frames 10000... [2024-08-05 08:15:42,494][00034] Num frames 10100... [2024-08-05 08:15:42,718][00034] Num frames 10200... [2024-08-05 08:15:42,943][00034] Num frames 10300... [2024-08-05 08:15:43,169][00034] Num frames 10400... [2024-08-05 08:15:43,396][00034] Num frames 10500... [2024-08-05 08:15:43,617][00034] Num frames 10600... [2024-08-05 08:15:43,840][00034] Num frames 10700... [2024-08-05 08:15:44,064][00034] Num frames 10800... 
[2024-08-05 08:15:44,289][00034] Num frames 10900... [2024-08-05 08:15:44,506][00034] Num frames 11000... [2024-08-05 08:15:44,723][00034] Num frames 11100... [2024-08-05 08:15:44,947][00034] Num frames 11200... [2024-08-05 08:15:45,172][00034] Num frames 11300... [2024-08-05 08:15:45,391][00034] Num frames 11400... [2024-08-05 08:15:45,609][00034] Num frames 11500... [2024-08-05 08:15:45,832][00034] Num frames 11600... [2024-08-05 08:15:46,054][00034] Num frames 11700... [2024-08-05 08:15:46,282][00034] Num frames 11800... [2024-08-05 08:15:46,508][00034] Num frames 11900... [2024-08-05 08:15:46,731][00034] Num frames 12000... [2024-08-05 08:15:46,950][00034] Num frames 12100... [2024-08-05 08:15:47,171][00034] Num frames 12200... [2024-08-05 08:15:47,390][00034] Num frames 12300... [2024-08-05 08:15:47,610][00034] Num frames 12400... [2024-08-05 08:15:47,831][00034] Num frames 12500... [2024-08-05 08:15:48,045][00034] Num frames 12600... [2024-08-05 08:15:48,258][00034] Num frames 12700... [2024-08-05 08:15:48,477][00034] Num frames 12800... [2024-08-05 08:15:48,706][00034] Num frames 12900... [2024-08-05 08:15:48,948][00034] Num frames 13000... [2024-08-05 08:15:49,168][00034] Num frames 13100... [2024-08-05 08:15:49,388][00034] Num frames 13200... [2024-08-05 08:15:49,614][00034] Num frames 13300... [2024-08-05 08:15:49,850][00034] Num frames 13400... [2024-08-05 08:15:50,081][00034] Num frames 13500... [2024-08-05 08:15:50,302][00034] Num frames 13600... [2024-08-05 08:15:50,534][00034] Num frames 13700... [2024-08-05 08:15:50,762][00034] Num frames 13800... [2024-08-05 08:15:50,993][00034] Num frames 13900... [2024-08-05 08:15:51,217][00034] Num frames 14000... [2024-08-05 08:15:51,444][00034] Num frames 14100... [2024-08-05 08:15:51,670][00034] Num frames 14200... [2024-08-05 08:15:51,890][00034] Num frames 14300... 
[2024-08-05 08:15:52,064][00034] Large shaping reward -2.549 for [('FRAGCOUNT', -1.5, -1.0), ('DEATHCOUNT', -0.75, 1.0), ('HEALTH', -0.3, -100.0), ('AMMO5', -0.0005, -1.0), ('weapon5', 0.002)] [2024-08-05 08:15:52,115][00034] Num frames 14400... [2024-08-05 08:15:52,336][00034] Num frames 14500... [2024-08-05 08:15:52,567][00034] Num frames 14600... [2024-08-05 08:15:52,795][00034] Num frames 14700... [2024-08-05 08:15:53,019][00034] Num frames 14800... [2024-08-05 08:15:53,257][00034] Num frames 14900... [2024-08-05 08:15:53,485][00034] Num frames 15000... [2024-08-05 08:15:53,717][00034] Num frames 15100... [2024-08-05 08:15:53,951][00034] Num frames 15200... [2024-08-05 08:15:54,188][00034] Num frames 15300... [2024-08-05 08:15:54,420][00034] Num frames 15400... [2024-08-05 08:15:54,655][00034] Num frames 15500... [2024-08-05 08:15:54,870][00034] Num frames 15600... [2024-08-05 08:15:55,086][00034] Num frames 15700... [2024-08-05 08:15:55,359][00034] Num frames 15800... [2024-08-05 08:15:55,626][00034] Num frames 15900... [2024-08-05 08:15:55,858][00034] Num frames 16000... [2024-08-05 08:15:56,106][00034] Num frames 16100... [2024-08-05 08:15:56,345][00034] Num frames 16200... [2024-08-05 08:15:56,572][00034] Num frames 16300... [2024-08-05 08:15:56,797][00034] Num frames 16400... [2024-08-05 08:15:57,015][00034] Num frames 16500... [2024-08-05 08:15:57,242][00034] Num frames 16600... [2024-08-05 08:15:57,463][00034] Num frames 16700... 
[2024-08-05 08:15:57,681][00034] DAMAGECOUNT value on done: 508.0 [2024-08-05 08:15:57,683][00034] Sum rewards: 2.023, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-4.330', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.017', 'AMMO2': '0.020', 'WEAPON1': '0.020', 'AMMO4': '0.099', 'HITCOUNT': '0.140', 'AMMO3': '0.211', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.480', 'weapon5': '0.912', 'WEAPON3': '1.100', 'weapon2': '7.248', 'weapon3': '7.456'} [2024-08-05 08:15:57,745][00034] Avg episode rewards: #0: 7.268, true rewards: #0: 2.000 [2024-08-05 08:15:57,746][00034] Avg episode reward: 7.268, avg true_objective: 2.000 [2024-08-05 08:15:57,753][00034] Num frames 16800... [2024-08-05 08:15:57,977][00034] Num frames 16900... [2024-08-05 08:15:58,203][00034] Num frames 17000... [2024-08-05 08:15:58,429][00034] Num frames 17100... [2024-08-05 08:15:58,659][00034] Num frames 17200... [2024-08-05 08:15:58,922][00034] Num frames 17300... [2024-08-05 08:15:59,146][00034] Num frames 17400... [2024-08-05 08:15:59,361][00034] Num frames 17500... [2024-08-05 08:15:59,573][00034] Num frames 17600... [2024-08-05 08:15:59,796][00034] Num frames 17700... [2024-08-05 08:16:00,012][00034] Num frames 17800... [2024-08-05 08:16:00,227][00034] Num frames 17900... [2024-08-05 08:16:00,444][00034] Num frames 18000... [2024-08-05 08:16:00,664][00034] Num frames 18100... [2024-08-05 08:16:00,882][00034] Num frames 18200... [2024-08-05 08:16:01,105][00034] Num frames 18300... [2024-08-05 08:16:01,331][00034] Num frames 18400... [2024-08-05 08:16:01,551][00034] Num frames 18500... [2024-08-05 08:16:01,773][00034] Num frames 18600... [2024-08-05 08:16:01,990][00034] Num frames 18700... [2024-08-05 08:16:02,212][00034] Num frames 18800... [2024-08-05 08:16:02,443][00034] Num frames 18900... [2024-08-05 08:16:02,673][00034] Num frames 19000... [2024-08-05 08:16:02,894][00034] Num frames 19100... [2024-08-05 08:16:03,117][00034] Num frames 19200... [2024-08-05 08:16:03,342][00034] Num frames 19300... 
[2024-08-05 08:16:03,561][00034] Num frames 19400... [2024-08-05 08:16:03,780][00034] Num frames 19500... [2024-08-05 08:16:04,003][00034] Num frames 19600... [2024-08-05 08:16:04,236][00034] Num frames 19700... [2024-08-05 08:16:04,470][00034] Num frames 19800... [2024-08-05 08:16:04,700][00034] Num frames 19900... [2024-08-05 08:16:04,927][00034] Num frames 20000... [2024-08-05 08:16:05,157][00034] Num frames 20100... [2024-08-05 08:16:05,382][00034] Num frames 20200... [2024-08-05 08:16:05,605][00034] Num frames 20300... [2024-08-05 08:16:05,830][00034] Num frames 20400... [2024-08-05 08:16:06,046][00034] Num frames 20500... [2024-08-05 08:16:06,259][00034] Num frames 20600... [2024-08-05 08:16:06,474][00034] Num frames 20700... [2024-08-05 08:16:06,687][00034] Num frames 20800... [2024-08-05 08:16:06,902][00034] Num frames 20900... [2024-08-05 08:16:07,112][00034] Num frames 21000... [2024-08-05 08:16:07,330][00034] Num frames 21100... [2024-08-05 08:16:07,550][00034] Num frames 21200... [2024-08-05 08:16:07,771][00034] Num frames 21300... [2024-08-05 08:16:07,993][00034] Num frames 21400... [2024-08-05 08:16:08,211][00034] Num frames 21500... [2024-08-05 08:16:08,433][00034] Num frames 21600... [2024-08-05 08:16:08,654][00034] Num frames 21700... [2024-08-05 08:16:08,921][00034] Num frames 21800... [2024-08-05 08:16:09,138][00034] Num frames 21900... [2024-08-05 08:16:09,351][00034] Num frames 22000... [2024-08-05 08:16:09,568][00034] Num frames 22100... [2024-08-05 08:16:09,787][00034] Num frames 22200... [2024-08-05 08:16:10,000][00034] Num frames 22300... [2024-08-05 08:16:10,220][00034] Num frames 22400... [2024-08-05 08:16:10,442][00034] Num frames 22500... [2024-08-05 08:16:10,664][00034] Num frames 22600... [2024-08-05 08:16:10,900][00034] Num frames 22700... [2024-08-05 08:16:11,115][00034] Num frames 22800... [2024-08-05 08:16:11,330][00034] Num frames 22900... [2024-08-05 08:16:11,551][00034] Num frames 23000... 
[2024-08-05 08:16:11,771][00034] Num frames 23100... [2024-08-05 08:16:11,992][00034] Num frames 23200... [2024-08-05 08:16:12,211][00034] Num frames 23300... [2024-08-05 08:16:12,425][00034] Num frames 23400... [2024-08-05 08:16:12,644][00034] Num frames 23500... [2024-08-05 08:16:12,872][00034] Num frames 23600... [2024-08-05 08:16:13,100][00034] Num frames 23700... [2024-08-05 08:16:13,317][00034] Num frames 23800... [2024-08-05 08:16:13,531][00034] Num frames 23900... [2024-08-05 08:16:13,752][00034] Num frames 24000... [2024-08-05 08:16:13,969][00034] Num frames 24100... [2024-08-05 08:16:14,188][00034] Num frames 24200... [2024-08-05 08:16:14,413][00034] Num frames 24300... [2024-08-05 08:16:14,633][00034] Num frames 24400... [2024-08-05 08:16:14,849][00034] Num frames 24500... [2024-08-05 08:16:15,069][00034] Num frames 24600... [2024-08-05 08:16:15,283][00034] Num frames 24700... [2024-08-05 08:16:15,507][00034] Num frames 24800... [2024-08-05 08:16:15,731][00034] Num frames 24900... [2024-08-05 08:16:15,954][00034] Num frames 25000... [2024-08-05 08:16:16,169][00034] Num frames 25100... [2024-08-05 08:16:16,382][00034] DAMAGECOUNT value on done: 973.0 [2024-08-05 08:16:16,384][00034] Sum rewards: 7.632, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-4.340', 'AMMO4': '-0.061', 'AMMO2': '-0.012', 'AMMO5': '0.015', 'AMMO3': '0.211', 'weapon5': '0.270', 'WEAPON5': '0.300', 'HITCOUNT': '0.370', 'WEAPON3': '1.200', 'DAMAGECOUNT': '1.395', 'FRAGCOUNT': '4.000', 'weapon3': '7.740', 'weapon2': '7.794'} [2024-08-05 08:16:16,446][00034] Avg episode rewards: #0: 7.389, true rewards: #0: 2.667 [2024-08-05 08:16:16,447][00034] Avg episode reward: 7.389, avg true_objective: 2.667 [2024-08-05 08:16:16,457][00034] Num frames 25200... [2024-08-05 08:16:16,677][00034] Num frames 25300... [2024-08-05 08:16:16,899][00034] Num frames 25400... [2024-08-05 08:16:17,118][00034] Num frames 25500... [2024-08-05 08:16:17,334][00034] Num frames 25600... 
[2024-08-05 08:16:17,555][00034] Num frames 25700... [2024-08-05 08:16:17,773][00034] Num frames 25800... [2024-08-05 08:16:17,995][00034] Num frames 25900... [2024-08-05 08:16:18,216][00034] Num frames 26000... [2024-08-05 08:16:18,440][00034] Num frames 26100... [2024-08-05 08:16:18,660][00034] Num frames 26200... [2024-08-05 08:16:18,905][00034] Num frames 26300... [2024-08-05 08:16:19,129][00034] Num frames 26400... [2024-08-05 08:16:19,345][00034] Num frames 26500... [2024-08-05 08:16:19,559][00034] Num frames 26600... [2024-08-05 08:16:19,785][00034] Num frames 26700... [2024-08-05 08:16:20,008][00034] Num frames 26800... [2024-08-05 08:16:20,224][00034] Num frames 26900... [2024-08-05 08:16:20,445][00034] Num frames 27000... [2024-08-05 08:16:20,666][00034] Num frames 27100... [2024-08-05 08:16:20,884][00034] Num frames 27200... [2024-08-05 08:16:21,102][00034] Num frames 27300... [2024-08-05 08:16:21,316][00034] Num frames 27400... [2024-08-05 08:16:21,538][00034] Num frames 27500... [2024-08-05 08:16:21,763][00034] Num frames 27600... [2024-08-05 08:16:21,993][00034] Num frames 27700... [2024-08-05 08:16:22,224][00034] Num frames 27800... [2024-08-05 08:16:22,439][00034] Num frames 27900... [2024-08-05 08:16:22,655][00034] Num frames 28000... [2024-08-05 08:16:22,879][00034] Num frames 28100... [2024-08-05 08:16:23,096][00034] Num frames 28200... [2024-08-05 08:16:23,319][00034] Num frames 28300... [2024-08-05 08:16:23,542][00034] Num frames 28400... [2024-08-05 08:16:23,765][00034] Num frames 28500... [2024-08-05 08:16:23,979][00034] Num frames 28600... [2024-08-05 08:16:24,198][00034] Num frames 28700... [2024-08-05 08:16:24,413][00034] Num frames 28800... [2024-08-05 08:16:24,639][00034] Num frames 28900... [2024-08-05 08:16:24,866][00034] Num frames 29000... [2024-08-05 08:16:25,088][00034] Num frames 29100... [2024-08-05 08:16:25,304][00034] Num frames 29200... [2024-08-05 08:16:25,526][00034] Num frames 29300... 
[2024-08-05 08:16:25,745][00034] Num frames 29400... [2024-08-05 08:16:25,964][00034] Num frames 29500... [2024-08-05 08:16:26,179][00034] Num frames 29600... [2024-08-05 08:16:26,399][00034] Num frames 29700... [2024-08-05 08:16:26,627][00034] Num frames 29800... [2024-08-05 08:16:26,876][00034] Num frames 29900... [2024-08-05 08:16:27,133][00034] Num frames 30000... [2024-08-05 08:16:27,359][00034] Num frames 30100... [2024-08-05 08:16:27,585][00034] Num frames 30200... [2024-08-05 08:16:27,816][00034] Num frames 30300... [2024-08-05 08:16:28,041][00034] Num frames 30400... [2024-08-05 08:16:28,261][00034] Num frames 30500... [2024-08-05 08:16:28,502][00034] Num frames 30600... [2024-08-05 08:16:28,728][00034] Num frames 30700... [2024-08-05 08:16:28,975][00034] Num frames 30800... [2024-08-05 08:16:29,200][00034] Num frames 30900... [2024-08-05 08:16:29,429][00034] Num frames 31000... [2024-08-05 08:16:29,646][00034] Num frames 31100... [2024-08-05 08:16:29,866][00034] Num frames 31200... [2024-08-05 08:16:30,080][00034] Num frames 31300... [2024-08-05 08:16:30,294][00034] Num frames 31400... [2024-08-05 08:16:30,513][00034] Num frames 31500... [2024-08-05 08:16:30,729][00034] Num frames 31600... [2024-08-05 08:16:30,946][00034] Num frames 31700... [2024-08-05 08:16:31,160][00034] Num frames 31800... [2024-08-05 08:16:31,378][00034] Num frames 31900... [2024-08-05 08:16:31,589][00034] Num frames 32000... [2024-08-05 08:16:31,810][00034] Num frames 32100... [2024-08-05 08:16:32,020][00034] Num frames 32200... [2024-08-05 08:16:32,237][00034] Num frames 32300... [2024-08-05 08:16:32,450][00034] Num frames 32400... [2024-08-05 08:16:32,661][00034] Num frames 32500... [2024-08-05 08:16:32,874][00034] Num frames 32600... [2024-08-05 08:16:33,089][00034] Num frames 32700... [2024-08-05 08:16:33,302][00034] Num frames 32800... [2024-08-05 08:16:33,517][00034] Num frames 32900... [2024-08-05 08:16:33,736][00034] Num frames 33000... 
[2024-08-05 08:16:33,952][00034] Num frames 33100... [2024-08-05 08:16:34,173][00034] Num frames 33200... [2024-08-05 08:16:34,386][00034] Num frames 33300... [2024-08-05 08:16:34,605][00034] Num frames 33400... [2024-08-05 08:16:34,817][00034] Num frames 33500... [2024-08-05 08:16:35,022][00034] DAMAGECOUNT value on done: 1146.0 [2024-08-05 08:16:35,084][00034] Avg episode rewards: #0: 6.912, true rewards: #0: 2.000 [2024-08-05 08:16:35,085][00034] Avg episode reward: 6.912, avg true_objective: 2.000 [2024-08-05 08:18:17,850][00034] Replay video saved to /kaggle/working/train_dir/default_experiment/replay.mp4!
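The reward entries above carry some arithmetic that can be checked by hand. The following is a minimal sketch, not part of the original run: it re-parses the two `Sum rewards` entries (copied verbatim from the log) and checks that each reported total matches the sum of its `reward structure` components. The helper names `check_reward_sums` and `implied_episode_value` are hypothetical. The second helper assumes that `Avg episode reward` is a plain mean over all episodes finished so far; the log does not state this.

```python
import ast
import re

# The two "Sum rewards" entries, copied verbatim from the log above.
LOG = """
[2024-08-05 08:15:57,683][00034] Sum rewards: 2.023, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-4.330', 'FRAGCOUNT': '-0.500', 'AMMO5': '0.017', 'AMMO2': '0.020', 'WEAPON1': '0.020', 'AMMO4': '0.099', 'HITCOUNT': '0.140', 'AMMO3': '0.211', 'WEAPON5': '0.400', 'DAMAGECOUNT': '0.480', 'weapon5': '0.912', 'WEAPON3': '1.100', 'weapon2': '7.248', 'weapon3': '7.456'}
[2024-08-05 08:16:16,384][00034] Sum rewards: 7.632, reward structure: {'DEATHCOUNT': '-11.250', 'HEALTH': '-4.340', 'AMMO4': '-0.061', 'AMMO2': '-0.012', 'AMMO5': '0.015', 'AMMO3': '0.211', 'weapon5': '0.270', 'WEAPON5': '0.300', 'HITCOUNT': '0.370', 'WEAPON3': '1.200', 'DAMAGECOUNT': '1.395', 'FRAGCOUNT': '4.000', 'weapon3': '7.740', 'weapon2': '7.794'}
"""

# Captures the reported total and the dict literal that follows it.
SUM_RE = re.compile(r"Sum rewards: (-?\d+\.\d+), reward structure: (\{.*?\})")


def check_reward_sums(text, tol=0.005):
    """Return (reported, recomputed, matches) for every 'Sum rewards' entry."""
    results = []
    for total_str, struct_str in SUM_RE.findall(text):
        components = ast.literal_eval(struct_str)  # values are strings
        recomputed = sum(float(v) for v in components.values())
        reported = float(total_str)
        results.append((reported, recomputed, abs(reported - recomputed) <= tol))
    return results


def implied_episode_value(prev_avg, prev_n, new_avg):
    """Invert a running mean: value of episode prev_n + 1.

    ASSUMPTION: the logged average is a plain mean over all finished
    episodes; prev_n is inferred by the caller, not read from the log.
    """
    return new_avg * (prev_n + 1) - prev_avg * prev_n


for reported, recomputed, ok in check_reward_sums(LOG):
    print(f"reported={reported:.3f} recomputed={recomputed:.3f} ok={ok}")

# Averages 7.268 -> 7.389 are consistent with a third episode worth ~7.632.
print(round(implied_episode_value(7.268, 2, 7.389), 3))
```

Under the same assumption, `implied_episode_value(7.389, 3, 6.912)` gives about 5.48 for the final episode, whose `Sum rewards` entry does not appear in the log. Treat that figure as an estimate from rounded averages, not a logged value.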