Narkantak committed on
Commit
604bd31
1 Parent(s): 993656f

Narkantak/phi3-Intent-entity-Classifier-Ashuv2

README.md CHANGED
@@ -16,7 +16,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.5929
+ - Loss: 0.4364

  ## Model description

@@ -51,32 +51,36 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 3.5322 | 0.86 | 3 | 2.4221 |
- | 1.7321 | 2.0 | 7 | 1.6896 |
- | 1.8073 | 2.86 | 10 | 1.3296 |
- | 0.9839 | 4.0 | 14 | 0.8705 |
- | 0.8891 | 4.86 | 17 | 0.6266 |
- | 0.4628 | 6.0 | 21 | 0.4525 |
- | 0.498 | 6.86 | 24 | 0.4093 |
- | 0.3318 | 8.0 | 28 | 0.3812 |
- | 0.396 | 8.86 | 31 | 0.3742 |
- | 0.2809 | 10.0 | 35 | 0.3603 |
- | 0.3487 | 10.86 | 38 | 0.3563 |
- | 0.2479 | 12.0 | 42 | 0.3621 |
- | 0.3085 | 12.86 | 45 | 0.3734 |
- | 0.2225 | 14.0 | 49 | 0.3733 |
- | 0.2716 | 14.86 | 52 | 0.3888 |
- | 0.1899 | 16.0 | 56 | 0.4287 |
- | 0.2319 | 16.86 | 59 | 0.4375 |
- | 0.1594 | 18.0 | 63 | 0.4491 |
- | 0.1928 | 18.86 | 66 | 0.4811 |
- | 0.1307 | 20.0 | 70 | 0.5047 |
- | 0.1577 | 20.86 | 73 | 0.5184 |
- | 0.1077 | 22.0 | 77 | 0.5539 |
- | 0.1333 | 22.86 | 80 | 0.5708 |
- | 0.0922 | 24.0 | 84 | 0.5795 |
- | 0.1167 | 24.86 | 87 | 0.5875 |
- | 0.0818 | 25.71 | 90 | 0.5929 |
+ | 2.8501 | 1.0 | 3 | 2.2605 |
+ | 2.1387 | 2.0 | 6 | 1.7344 |
+ | 1.5826 | 3.0 | 9 | 1.3666 |
+ | 1.2187 | 4.0 | 12 | 1.0485 |
+ | 0.8879 | 5.0 | 15 | 0.7558 |
+ | 0.6134 | 6.0 | 18 | 0.5396 |
+ | 0.4343 | 7.0 | 21 | 0.4304 |
+ | 0.3557 | 8.0 | 24 | 0.3943 |
+ | 0.3205 | 9.0 | 27 | 0.3689 |
+ | 0.2947 | 10.0 | 30 | 0.3580 |
+ | 0.2727 | 11.0 | 33 | 0.3371 |
+ | 0.2506 | 12.0 | 36 | 0.3361 |
+ | 0.2291 | 13.0 | 39 | 0.3342 |
+ | 0.2098 | 14.0 | 42 | 0.3332 |
+ | 0.1911 | 15.0 | 45 | 0.3446 |
+ | 0.1761 | 16.0 | 48 | 0.3334 |
+ | 0.159 | 17.0 | 51 | 0.3453 |
+ | 0.1399 | 18.0 | 54 | 0.3540 |
+ | 0.124 | 19.0 | 57 | 0.3631 |
+ | 0.1123 | 20.0 | 60 | 0.3636 |
+ | 0.0992 | 21.0 | 63 | 0.3778 |
+ | 0.0862 | 22.0 | 66 | 0.3862 |
+ | 0.0783 | 23.0 | 69 | 0.3966 |
+ | 0.0704 | 24.0 | 72 | 0.4072 |
+ | 0.0627 | 25.0 | 75 | 0.4178 |
+ | 0.0582 | 26.0 | 78 | 0.4200 |
+ | 0.0553 | 27.0 | 81 | 0.4283 |
+ | 0.0521 | 28.0 | 84 | 0.4338 |
+ | 0.0505 | 29.0 | 87 | 0.4366 |
+ | 0.0494 | 30.0 | 90 | 0.4364 |


  ### Framework versions
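Reviewer note: in the updated results table the validation loss stops improving partway through the run while the training loss keeps falling, and the headline `Loss: 0.4364` matches the final (epoch 30) row and not the lowest one. The sketch below only re-reads the numbers from that table to locate the minimum; the list and the helper name `best_epoch` are illustrative and are not part of the commit.

```python
# Validation loss per epoch (epochs 1..30), copied from the updated README table.
val_loss = [
    2.2605, 1.7344, 1.3666, 1.0485, 0.7558, 0.5396, 0.4304, 0.3943, 0.3689, 0.3580,
    0.3371, 0.3361, 0.3342, 0.3332, 0.3446, 0.3334, 0.3453, 0.3540, 0.3631, 0.3636,
    0.3778, 0.3862, 0.3966, 0.4072, 0.4178, 0.4200, 0.4283, 0.4338, 0.4366, 0.4364,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


epoch, loss = best_epoch(val_loss)
print(f"lowest validation loss {loss:.4f} at epoch {epoch}; final epoch loss {val_loss[-1]:.4f}")
```

Whether the uploaded adapter corresponds to the final or the lowest-loss checkpoint cannot be determined from this diff alone.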
adapter_config.json CHANGED
@@ -1,7 +1,7 @@
  {
  "alpha_pattern": {},
  "auto_mapping": null,
- "base_model_name_or_path": "microsoft/Phi-3-mini-128k-instruct",
+ "base_model_name_or_path": null,
  "bias": "none",
  "fan_in_fan_out": false,
  "inference_mode": true,
@@ -20,8 +20,8 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "qkv_proj",
- "o_proj"
+ "o_proj",
+ "qkv_proj"
  ],
  "task_type": "CAUSAL_LM",
  "use_dora": false,
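Reviewer note: this change sets `base_model_name_or_path` to `null`, so code that looks up the base model from `adapter_config.json` may no longer find it and the caller may have to supply it. A minimal sketch of such a fallback, assuming the base model is still the one named in the README; `resolve_base_model` and `FALLBACK_BASE` are hypothetical names, not part of this repository or of any library.

```python
import json

# Assumption: the base model named in the README and in the previous adapter_config.json.
FALLBACK_BASE = "microsoft/Phi-3-mini-128k-instruct"


def resolve_base_model(adapter_config_text, fallback=FALLBACK_BASE):
    """Return the base model id from an adapter_config.json string.

    Falls back to `fallback` when the field is null, empty, or missing.
    """
    config = json.loads(adapter_config_text)
    return config.get("base_model_name_or_path") or fallback


# Trimmed-down stand-in for the new adapter_config.json in this commit.
new_config = '{"base_model_name_or_path": null, "target_modules": ["o_proj", "qkv_proj"]}'
print(resolve_base_model(new_config))
```

The resolved id would then be passed to whatever loads the base model before the adapter weights are attached.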
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:0d6ada3d9b1a06475f8be9f823475b21849a01e204a30b6f70a33c07408fac8d
- size 18891496
+ oid sha256:1910f9b5a437d9925cecc5281f494a55bd6cc6308bce807a9a5e43ebe8d650f1
+ size 18893672
runs/Apr30_13-43-08_fde755c1ca53/events.out.tfevents.1714484590.fde755c1ca53.34.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d9725de9be8d60030fd19f986b933ea552bcc842aec44e8f646ed3feb70cb39
+ size 21944
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3b90fab03664f6c14713c918fe09e24e9a0292f4a63fadedda32f2a069eb3f9a
+ oid sha256:de4d7ffffc0db86285db4fd4472cccf37fb3b759ccdfd22c0b37a2536c56c234
  size 4920
wandb/debug-internal.log CHANGED
The diff for this file is too large to render. See raw diff
 
wandb/debug.log CHANGED
@@ -1,40 +1,43 @@
1
- 2024-04-30 12:03:37,855 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.6
2
- 2024-04-30 12:03:37,855 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
3
- 2024-04-30 12:03:37,855 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
4
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
5
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
6
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
7
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
8
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
9
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
10
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_init.py:_log_setup():521] Logging user logs to /kaggle/working/wandb/run-20240430_120337-od3lqfzi/logs/debug.log
11
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_init.py:_log_setup():522] Logging internal logs to /kaggle/working/wandb/run-20240430_120337-od3lqfzi/logs/debug-internal.log
12
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_init.py:_jupyter_setup():467] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x7ea8912229b0>
13
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_init.py:init():561] calling init triggers
14
- 2024-04-30 12:03:37,856 INFO MainThread:34 [wandb_init.py:init():568] wandb.init called with sweep_config: {}
15
  config: {}
16
- 2024-04-30 12:03:37,857 INFO MainThread:34 [wandb_init.py:init():611] starting backend
17
- 2024-04-30 12:03:37,857 INFO MainThread:34 [wandb_init.py:init():615] setting up manager
18
- 2024-04-30 12:03:37,858 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
19
- 2024-04-30 12:03:37,861 INFO MainThread:34 [wandb_init.py:init():623] backend started and connected
20
- 2024-04-30 12:03:37,872 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
21
- 2024-04-30 12:03:38,252 INFO MainThread:34 [wandb_init.py:init():715] updated telemetry
22
- 2024-04-30 12:03:38,256 INFO MainThread:34 [wandb_init.py:init():748] communicating run to backend with 90.0 second timeout
23
- 2024-04-30 12:03:38,471 INFO MainThread:34 [wandb_run.py:_on_init():2357] communicating current version
24
- 2024-04-30 12:03:38,555 INFO MainThread:34 [wandb_run.py:_on_init():2366] got version response
25
- 2024-04-30 12:03:38,556 INFO MainThread:34 [wandb_init.py:init():799] starting run threads in backend
26
- 2024-04-30 12:03:54,574 INFO MainThread:34 [wandb_run.py:_console_start():2335] atexit reg
27
- 2024-04-30 12:03:54,574 INFO MainThread:34 [wandb_run.py:_redirect():2190] redirect: wrap_raw
28
- 2024-04-30 12:03:54,575 INFO MainThread:34 [wandb_run.py:_redirect():2255] Wrapping output streams.
29
- 2024-04-30 12:03:54,575 INFO MainThread:34 [wandb_run.py:_redirect():2280] Redirects installed.
30
- 2024-04-30 12:03:54,576 INFO MainThread:34 [wandb_init.py:init():842] run started, returning control to user process
31
- 2024-04-30 12:03:54,582 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32064, 'hidden_size': 3072, 'intermediate_size': 8192, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'num_key_value_heads': 32, 'resid_pdrop': 0.0, 'embd_pdrop': 0.0, 'attention_dropout': 0.0, 'hidden_act': 'silu', 'max_position_embeddings': 131072, 'original_max_position_embeddings': 4096, 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 10000.0, 'rope_scaling': {'long_factor': [1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0799999237060547, 1.2299998998641968, 1.2299998998641968, 1.2999999523162842, 1.4499999284744263, 1.5999999046325684, 1.6499998569488525, 1.8999998569488525, 2.859999895095825, 3.68999981880188, 5.419999599456787, 5.489999771118164, 5.489999771118164, 9.09000015258789, 11.579999923706055, 15.65999984741211, 15.769999504089355, 15.789999961853027, 18.360000610351562, 21.989999771118164, 23.079999923706055, 30.009998321533203, 32.35000228881836, 32.590003967285156, 35.56000518798828, 39.95000457763672, 53.840003967285156, 56.20000457763672, 57.95000457763672, 59.29000473022461, 59.77000427246094, 59.920005798339844, 61.190006256103516, 61.96000671386719, 62.50000762939453, 63.3700065612793, 63.48000717163086, 63.48000717163086, 63.66000747680664, 63.850006103515625, 64.08000946044922, 64.760009765625, 64.80001068115234, 64.81001281738281, 64.81001281738281], 'short_factor': [1.05, 1.05, 1.05, 1.1, 1.1, 1.1500000000000001, 1.2000000000000002, 1.2500000000000002, 1.3000000000000003, 1.3500000000000003, 1.5000000000000004, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 
2.000000000000001, 2.000000000000001, 2.0500000000000007, 2.0500000000000007, 2.0500000000000007, 2.1000000000000005, 2.1000000000000005, 2.1000000000000005, 2.1500000000000004, 2.1500000000000004, 2.3499999999999996, 2.549999999999999, 2.5999999999999988, 2.5999999999999988, 2.7499999999999982, 2.849999999999998, 2.849999999999998, 2.9499999999999975], 'type': 'su'}, 'sliding_window': 262144, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['Phi3ForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 32000, 'eos_token_id': 32000, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'microsoft/Phi-3-mini-128k-instruct', 'transformers_version': '4.39.3', 'auto_map': {'AutoConfig': 'microsoft/Phi-3-mini-128k-instruct--configuration_phi3.Phi3Config', 'AutoModelForCausalLM': 
'microsoft/Phi-3-mini-128k-instruct--modeling_phi3.Phi3ForCausalLM'}, 'model_type': 'phi3', 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 30, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr30_12-03-27_2f7b60d19abc', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 
0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
32
- 2024-04-30 12:13:09,593 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
33
- 2024-04-30 12:13:09,593 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
34
- 2024-04-30 12:13:18,560 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
35
- 2024-04-30 12:13:18,587 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
36
- 2024-04-30 12:13:18,587 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
37
- 2024-04-30 12:13:54,173 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
38
- 2024-04-30 12:13:54,174 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
39
- 2024-04-30 12:13:54,174 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
40
- 2024-04-30 12:13:54,388 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
 
 
 
 
1
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.6
2
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
3
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
4
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
5
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
6
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
7
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
8
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
9
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
10
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_init.py:_log_setup():521] Logging user logs to /kaggle/working/wandb/run-20240430_134330-p3yujlfg/logs/debug.log
11
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_init.py:_log_setup():522] Logging internal logs to /kaggle/working/wandb/run-20240430_134330-p3yujlfg/logs/debug-internal.log
12
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:_jupyter_setup():467] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x781a89bef460>
13
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():561] calling init triggers
14
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():568] wandb.init called with sweep_config: {}
15
  config: {}
16
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():611] starting backend
17
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():615] setting up manager
18
+ 2024-04-30 13:43:30,427 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
19
+ 2024-04-30 13:43:30,430 INFO MainThread:34 [wandb_init.py:init():623] backend started and connected
20
+ 2024-04-30 13:43:30,442 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
21
+ 2024-04-30 13:43:30,930 INFO MainThread:34 [wandb_init.py:init():715] updated telemetry
22
+ 2024-04-30 13:43:30,933 INFO MainThread:34 [wandb_init.py:init():748] communicating run to backend with 90.0 second timeout
23
+ 2024-04-30 13:43:31,069 INFO MainThread:34 [wandb_run.py:_on_init():2357] communicating current version
24
+ 2024-04-30 13:43:31,155 INFO MainThread:34 [wandb_run.py:_on_init():2366] got version response
25
+ 2024-04-30 13:43:31,157 INFO MainThread:34 [wandb_init.py:init():799] starting run threads in backend
26
+ 2024-04-30 13:43:47,243 INFO MainThread:34 [wandb_run.py:_console_start():2335] atexit reg
27
+ 2024-04-30 13:43:47,244 INFO MainThread:34 [wandb_run.py:_redirect():2190] redirect: wrap_raw
28
+ 2024-04-30 13:43:47,245 INFO MainThread:34 [wandb_run.py:_redirect():2255] Wrapping output streams.
29
+ 2024-04-30 13:43:47,245 INFO MainThread:34 [wandb_run.py:_redirect():2280] Redirects installed.
30
+ 2024-04-30 13:43:47,246 INFO MainThread:34 [wandb_init.py:init():842] run started, returning control to user process
31
+ 2024-04-30 13:43:47,253 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32064, 'hidden_size': 3072, 'intermediate_size': 8192, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'num_key_value_heads': 32, 'resid_pdrop': 0.0, 'embd_pdrop': 0.0, 'attention_dropout': 0.0, 'hidden_act': 'silu', 'max_position_embeddings': 131072, 'original_max_position_embeddings': 4096, 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 10000.0, 'rope_scaling': {'long_factor': [1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0799999237060547, 1.2299998998641968, 1.2299998998641968, 1.2999999523162842, 1.4499999284744263, 1.5999999046325684, 1.6499998569488525, 1.8999998569488525, 2.859999895095825, 3.68999981880188, 5.419999599456787, 5.489999771118164, 5.489999771118164, 9.09000015258789, 11.579999923706055, 15.65999984741211, 15.769999504089355, 15.789999961853027, 18.360000610351562, 21.989999771118164, 23.079999923706055, 30.009998321533203, 32.35000228881836, 32.590003967285156, 35.56000518798828, 39.95000457763672, 53.840003967285156, 56.20000457763672, 57.95000457763672, 59.29000473022461, 59.77000427246094, 59.920005798339844, 61.190006256103516, 61.96000671386719, 62.50000762939453, 63.3700065612793, 63.48000717163086, 63.48000717163086, 63.66000747680664, 63.850006103515625, 64.08000946044922, 64.760009765625, 64.80001068115234, 64.81001281738281, 64.81001281738281], 'short_factor': [1.05, 1.05, 1.05, 1.1, 1.1, 1.1500000000000001, 1.2000000000000002, 1.2500000000000002, 1.3000000000000003, 1.3500000000000003, 1.5000000000000004, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 
2.000000000000001, 2.000000000000001, 2.0500000000000007, 2.0500000000000007, 2.0500000000000007, 2.1000000000000005, 2.1000000000000005, 2.1000000000000005, 2.1500000000000004, 2.1500000000000004, 2.3499999999999996, 2.549999999999999, 2.5999999999999988, 2.5999999999999988, 2.7499999999999982, 2.849999999999998, 2.849999999999998, 2.9499999999999975], 'type': 'su'}, 'sliding_window': 262144, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['Phi3ForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 32000, 'eos_token_id': 32000, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'microsoft/Phi-3-mini-128k-instruct', 'transformers_version': '4.39.3', 'auto_map': {'AutoConfig': 'microsoft/Phi-3-mini-128k-instruct--configuration_phi3.Phi3Config', 'AutoModelForCausalLM': 
'microsoft/Phi-3-mini-128k-instruct--modeling_phi3.Phi3ForCausalLM'}, 'model_type': 'phi3', 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 30, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr30_13-43-08_fde755c1ca53', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 
0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
32
+ 2024-04-30 13:51:57,857 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
33
+ 2024-04-30 13:51:57,858 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
34
+ 2024-04-30 13:59:32,413 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
35
+ 2024-04-30 13:59:32,415 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
36
+ 2024-04-30 13:59:32,415 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
37
+ 2024-04-30 13:59:33,549 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
38
+ 2024-04-30 13:59:35,391 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
39
+ 2024-04-30 13:59:35,391 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
40
+ 2024-04-30 14:00:05,311 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
41
+ 2024-04-30 14:00:05,339 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
42
+ 2024-04-30 14:00:05,339 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
43
+ 2024-04-30 14:00:22,023 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
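Reviewer note: the logged training arguments give `learning_rate: 0.0002`, `lr_scheduler_type: 'linear'`, `warmup_steps: 2`, `per_device_train_batch_size: 6` and `gradient_accumulation_steps: 4`, and the README table ends at step 90. The sketch below illustrates what those settings imply under two assumptions that the diff does not confirm: a single training device, and the usual "linear warmup, then linear decay to zero" shape. `linear_lr` is a hypothetical helper written for this note, not a library call.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=2, total_steps=90):
    """Learning rate at optimizer step `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# Examples per optimizer step, assuming one device: 6 per batch x 4 accumulation steps.
effective_batch = 6 * 4

for step in (0, 1, 2, 46, 90):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 2, which is before the end of the first epoch (3 optimizer steps per epoch in the updated table).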
wandb/run-20240430_134330-p3yujlfg/files/conda-environment.yaml ADDED
File without changes
wandb/run-20240430_134330-p3yujlfg/files/config.yaml ADDED
@@ -0,0 +1,795 @@
+ wandb_version: 1
+
+ _wandb:
+   desc: null
+   value:
+     python_version: 3.10.13
+     cli_version: 0.16.6
+     framework: huggingface
+     huggingface_version: 4.39.3
+     is_jupyter_run: true
+     is_kaggle_kernel: true
+     start_time: 1714484610.0
+     t:
+       1:
+       - 1
+       - 2
+       - 3
+       - 5
+       - 11
+       - 12
+       - 49
+       - 51
+       - 53
+       - 55
+       - 71
+       - 98
+       - 105
+       2:
+       - 1
+       - 2
+       - 3
+       - 5
+       - 11
+       - 12
+       - 49
+       - 51
+       - 53
+       - 55
+       - 71
+       - 98
+       - 105
+       3:
+       - 7
+       - 23
+       - 62
+       4: 3.10.13
+       5: 0.16.6
+       6: 4.39.3
+       8:
+       - 1
+       - 2
+       - 5
+       9:
+         1: transformers_trainer
+       13: linux-x86_64
+     m:
+     - 1: train/global_step
+       6:
+       - 3
+     - 1: train/loss
+       5: 1
+       6:
+       - 1
+     - 1: train/grad_norm
+       5: 1
+       6:
+       - 1
+     - 1: train/learning_rate
+       5: 1
+       6:
+       - 1
+     - 1: train/epoch
+       5: 1
+       6:
+       - 1
+     - 1: eval/loss
+       5: 1
+       6:
+       - 1
+     - 1: eval/runtime
+       5: 1
+       6:
+       - 1
+     - 1: eval/samples_per_second
+       5: 1
+       6:
+       - 1
+     - 1: eval/steps_per_second
+       5: 1
+       6:
+       - 1
+ vocab_size:
+   desc: null
+   value: 32064
+ hidden_size:
+   desc: null
+   value: 3072
+ intermediate_size:
+   desc: null
+   value: 8192
+ num_hidden_layers:
+   desc: null
+   value: 32
+ num_attention_heads:
+   desc: null
+   value: 32
+ num_key_value_heads:
+   desc: null
+   value: 32
+ resid_pdrop:
+   desc: null
+   value: 0.0
+ embd_pdrop:
+   desc: null
+   value: 0.0
+ attention_dropout:
+   desc: null
+   value: 0.0
+ hidden_act:
+   desc: null
+   value: silu
+ max_position_embeddings:
+   desc: null
+   value: 131072
+ original_max_position_embeddings:
+   desc: null
+   value: 4096
+ initializer_range:
+   desc: null
+   value: 0.02
+ rms_norm_eps:
+   desc: null
+   value: 1.0e-05
+ use_cache:
+   desc: null
+   value: false
+ rope_theta:
+   desc: null
+   value: 10000.0
+ rope_scaling:
+   desc: null
+   value:
+     long_factor:
+     - 1.0299999713897705
+     - 1.0499999523162842
+     - 1.0499999523162842
+     - 1.0799999237060547
+     - 1.2299998998641968
+     - 1.2299998998641968
+     - 1.2999999523162842
+     - 1.4499999284744263
+     - 1.5999999046325684
+     - 1.6499998569488525
+     - 1.8999998569488525
+     - 2.859999895095825
+     - 3.68999981880188
+     - 5.419999599456787
+     - 5.489999771118164
+     - 5.489999771118164
+     - 9.09000015258789
+     - 11.579999923706055
+     - 15.65999984741211
+     - 15.769999504089355
+     - 15.789999961853027
+     - 18.360000610351562
+     - 21.989999771118164
+     - 23.079999923706055
+     - 30.009998321533203
+     - 32.35000228881836
+     - 32.590003967285156
+     - 35.56000518798828
+     - 39.95000457763672
+     - 53.840003967285156
+     - 56.20000457763672
+     - 57.95000457763672
+     - 59.29000473022461
+     - 59.77000427246094
+     - 59.920005798339844
+     - 61.190006256103516
+     - 61.96000671386719
+     - 62.50000762939453
+     - 63.3700065612793
+     - 63.48000717163086
+     - 63.48000717163086
+     - 63.66000747680664
+     - 63.850006103515625
+     - 64.08000946044922
+     - 64.760009765625
+     - 64.80001068115234
+     - 64.81001281738281
+     - 64.81001281738281
+     short_factor:
+     - 1.05
+     - 1.05
+     - 1.05
+     - 1.1
+     - 1.1
+     - 1.1500000000000001
+     - 1.2000000000000002
+     - 1.2500000000000002
+     - 1.3000000000000003
+     - 1.3500000000000003
+     - 1.5000000000000004
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.000000000000001
+     - 2.0500000000000007
+     - 2.0500000000000007
+     - 2.0500000000000007
+     - 2.1000000000000005
+     - 2.1000000000000005
+     - 2.1000000000000005
+     - 2.1500000000000004
+     - 2.1500000000000004
+     - 2.3499999999999996
+     - 2.549999999999999
+     - 2.5999999999999988
+     - 2.5999999999999988
+     - 2.7499999999999982
+     - 2.849999999999998
+     - 2.849999999999998
+     - 2.9499999999999975
+     type: su
+ sliding_window:
+   desc: null
+   value: 262144
+ return_dict:
+   desc: null
+   value: true
+ output_hidden_states:
+   desc: null
+   value: false
+ output_attentions:
+   desc: null
+   value: false
+ torchscript:
+   desc: null
+   value: false
+ torch_dtype:
+   desc: null
+   value: bfloat16
+ use_bfloat16:
+   desc: null
+   value: false
+ tf_legacy_loss:
+   desc: null
+   value: false
+ pruned_heads:
+   desc: null
+   value: {}
+ tie_word_embeddings:
+   desc: null
+   value: false
+ chunk_size_feed_forward:
+   desc: null
+   value: 0
+ is_encoder_decoder:
+   desc: null
+   value: false
+ is_decoder:
+   desc: null
+   value: false
+ cross_attention_hidden_size:
+   desc: null
+   value: null
+ add_cross_attention:
+   desc: null
+   value: false
+ tie_encoder_decoder:
+   desc: null
+   value: false
+ max_length:
+   desc: null
+   value: 20
+ min_length:
+   desc: null
+   value: 0
+ do_sample:
+   desc: null
+   value: false
+ early_stopping:
+   desc: null
+   value: false
+ num_beams:
+   desc: null
+   value: 1
+ num_beam_groups:
+   desc: null
+   value: 1
+ diversity_penalty:
+   desc: null
+   value: 0.0
+ temperature:
+   desc: null
+   value: 1.0
+ top_k:
+   desc: null
+   value: 50
+ top_p:
+   desc: null
+   value: 1.0
+ typical_p:
+   desc: null
+   value: 1.0
+ repetition_penalty:
+   desc: null
+   value: 1.0
+ length_penalty:
+   desc: null
+   value: 1.0
+ no_repeat_ngram_size:
+   desc: null
+   value: 0
+ encoder_no_repeat_ngram_size:
+   desc: null
+   value: 0
+ bad_words_ids:
+   desc: null
+   value: null
+ num_return_sequences:
+   desc: null
+   value: 1
+ output_scores:
+   desc: null
+   value: false
+ return_dict_in_generate:
+   desc: null
+   value: false
+ forced_bos_token_id:
+   desc: null
+   value: null
+ forced_eos_token_id:
+   desc: null
+   value: null
+ remove_invalid_values:
+   desc: null
+   value: false
+ exponential_decay_length_penalty:
+   desc: null
+   value: null
+ suppress_tokens:
+   desc: null
+   value: null
+ begin_suppress_tokens:
+   desc: null
+   value: null
+ architectures:
+   desc: null
+   value:
+   - Phi3ForCausalLM
+ finetuning_task:
+   desc: null
+   value: null
+ id2label:
+   desc: null
+   value:
+     '0': LABEL_0
+     '1': LABEL_1
+ label2id:
+   desc: null
+   value:
+     LABEL_0: 0
+     LABEL_1: 1
+ tokenizer_class:
+   desc: null
+   value: null
+ prefix:
+   desc: null
+   value: null
+ bos_token_id:
+   desc: null
+   value: 1
+ pad_token_id:
+   desc: null
+   value: 32000
+ eos_token_id:
+   desc: null
+   value: 32000
+ sep_token_id:
+   desc: null
+   value: null
+ decoder_start_token_id:
+   desc: null
+   value: null
+ task_specific_params:
+   desc: null
+   value: null
+ problem_type:
+   desc: null
+   value: null
+ _name_or_path:
+   desc: null
+   value: microsoft/Phi-3-mini-128k-instruct
+ transformers_version:
+   desc: null
+   value: 4.39.3
+ auto_map:
+   desc: null
+   value:
+     AutoConfig: microsoft/Phi-3-mini-128k-instruct--configuration_phi3.Phi3Config
+     AutoModelForCausalLM: microsoft/Phi-3-mini-128k-instruct--modeling_phi3.Phi3ForCausalLM
+ model_type:
+   desc: null
+   value: phi3
+ output_dir:
+   desc: null
+   value: /kaggle/working/
+ overwrite_output_dir:
+   desc: null
+   value: false
+ do_train:
+   desc: null
+   value: false
+ do_eval:
+   desc: null
+   value: true
+ do_predict:
+   desc: null
+   value: false
+ evaluation_strategy:
+   desc: null
+   value: epoch
+ prediction_loss_only:
+   desc: null
+   value: false
+ per_device_train_batch_size:
+   desc: null
+   value: 6
+ per_device_eval_batch_size:
+   desc: null
+   value: 6
+ per_gpu_train_batch_size:
+   desc: null
+   value: null
+ per_gpu_eval_batch_size:
+   desc: null
+   value: null
+ gradient_accumulation_steps:
+   desc: null
+   value: 4
+ eval_accumulation_steps:
+   desc: null
+   value: null
+ eval_delay:
+   desc: null
+   value: 0
+ learning_rate:
+   desc: null
+   value: 0.0002
+ weight_decay:
+   desc: null
+   value: 0.01
+ adam_beta1:
+   desc: null
+   value: 0.9
+ adam_beta2:
+   desc: null
+   value: 0.999
+ adam_epsilon:
+   desc: null
+   value: 1.0e-08
+ max_grad_norm:
+   desc: null
+   value: 1.0
+ num_train_epochs:
+   desc: null
+   value: 30
+ max_steps:
+   desc: null
+   value: -1
+ lr_scheduler_type:
+   desc: null
+   value: linear
+ lr_scheduler_kwargs:
+   desc: null
+   value: {}
+ warmup_ratio:
+   desc: null
+   value: 0.0
+ warmup_steps:
+   desc: null
+   value: 2
+ log_level:
+   desc: null
+   value: passive
+ log_level_replica:
+   desc: null
+   value: warning
+ log_on_each_node:
+   desc: null
+   value: true
+ logging_dir:
+   desc: null
+   value: /kaggle/working/runs/Apr30_13-43-08_fde755c1ca53
+ logging_strategy:
+   desc: null
+   value: epoch
+ logging_first_step:
+   desc: null
+   value: false
+ logging_steps:
+   desc: null
+   value: 500
+ logging_nan_inf_filter:
+   desc: null
+   value: true
+ save_strategy:
+   desc: null
+   value: epoch
+ save_steps:
+   desc: null
+   value: 500
+ save_total_limit:
+   desc: null
+   value: null
+ save_safetensors:
+   desc: null
+   value: true
+ save_on_each_node:
+   desc: null
+   value: false
+ save_only_model:
+   desc: null
+   value: false
+ no_cuda:
+   desc: null
+   value: false
+ use_cpu:
+   desc: null
+   value: false
+ use_mps_device:
+   desc: null
+   value: false
+ seed:
+   desc: null
+   value: 42
+ data_seed:
+   desc: null
+   value: null
+ jit_mode_eval:
+   desc: null
+   value: false
+ use_ipex:
+   desc: null
+   value: false
+ bf16:
+   desc: null
+   value: false
+ fp16:
+   desc: null
+   value: true
+ fp16_opt_level:
+   desc: null
+   value: O1
+ half_precision_backend:
+   desc: null
+   value: auto
+ bf16_full_eval:
+   desc: null
+   value: false
+ fp16_full_eval:
+   desc: null
+   value: false
+ tf32:
+   desc: null
+   value: null
+ local_rank:
+   desc: null
+   value: 0
+ ddp_backend:
+   desc: null
+   value: null
+ tpu_num_cores:
+   desc: null
+   value: null
+ tpu_metrics_debug:
+   desc: null
+   value: false
+ debug:
+   desc: null
+   value: []
+ dataloader_drop_last:
+   desc: null
+   value: false
+ eval_steps:
+   desc: null
+   value: null
+ dataloader_num_workers:
+   desc: null
+   value: 0
+ dataloader_prefetch_factor:
+   desc: null
+   value: null
+ past_index:
+   desc: null
+   value: -1
+ run_name:
+   desc: null
+   value: /kaggle/working/
+ disable_tqdm:
+   desc: null
+   value: false
+ remove_unused_columns:
+   desc: null
+   value: true
+ label_names:
+   desc: null
+   value: null
+ load_best_model_at_end:
+   desc: null
+   value: true
+ metric_for_best_model:
+   desc: null
+   value: loss
+ greater_is_better:
+   desc: null
+   value: false
+ ignore_data_skip:
+   desc: null
+   value: false
+ fsdp:
+   desc: null
+   value: []
+ fsdp_min_num_params:
+   desc: null
+   value: 0
+ fsdp_config:
+   desc: null
+   value:
+     min_num_params: 0
+     xla: false
+     xla_fsdp_v2: false
+     xla_fsdp_grad_ckpt: false
+ fsdp_transformer_layer_cls_to_wrap:
+   desc: null
+   value: null
+ accelerator_config:
+   desc: null
+   value:
+     split_batches: false
+     dispatch_batches: null
+     even_batches: true
+     use_seedable_sampler: true
+ deepspeed:
+   desc: null
+   value: null
+ label_smoothing_factor:
+   desc: null
+   value: 0.0
+ optim:
+   desc: null
+   value: paged_adamw_8bit
+ optim_args:
+   desc: null
+   value: null
+ adafactor:
+   desc: null
+   value: false
+ group_by_length:
+   desc: null
+   value: false
+ length_column_name:
+   desc: null
+   value: length
+ report_to:
+   desc: null
+   value:
+   - tensorboard
+   - wandb
+ ddp_find_unused_parameters:
+   desc: null
+   value: null
+ ddp_bucket_cap_mb:
+   desc: null
+   value: null
+ ddp_broadcast_buffers:
+   desc: null
+   value: null
+ dataloader_pin_memory:
+   desc: null
+   value: true
+ dataloader_persistent_workers:
+   desc: null
+   value: false
+ skip_memory_metrics:
+   desc: null
+   value: true
+ use_legacy_prediction_loop:
+   desc: null
+   value: false
+ push_to_hub:
+   desc: null
+   value: false
+ resume_from_checkpoint:
+   desc: null
+   value: null
+ hub_model_id:
+   desc: null
+   value: null
+ hub_strategy:
+   desc: null
+   value: every_save
+ hub_token:
+   desc: null
+   value: <HUB_TOKEN>
+ hub_private_repo:
+   desc: null
+   value: false
+ hub_always_push:
+   desc: null
+   value: false
+ gradient_checkpointing:
+   desc: null
+   value: false
+ gradient_checkpointing_kwargs:
+   desc: null
+   value: null
+ include_inputs_for_metrics:
+   desc: null
+   value: false
+ fp16_backend:
+   desc: null
+   value: auto
+ push_to_hub_model_id:
+   desc: null
+   value: null
+ push_to_hub_organization:
+   desc: null
+   value: null
+ push_to_hub_token:
+   desc: null
+   value: <PUSH_TO_HUB_TOKEN>
+ mp_parameters:
+   desc: null
+   value: ''
+ auto_find_batch_size:
+   desc: null
+   value: false
+ full_determinism:
+   desc: null
+   value: false
+ torchdynamo:
+   desc: null
+   value: null
+ ray_scope:
+   desc: null
+   value: last
+ ddp_timeout:
+   desc: null
+   value: 1800
+ torch_compile:
+   desc: null
+   value: false
+ torch_compile_backend:
+   desc: null
+   value: null
+ torch_compile_mode:
+   desc: null
+   value: null
+ dispatch_batches:
+   desc: null
+   value: null
+ split_batches:
+   desc: null
+   value: null
+ include_tokens_per_second:
+   desc: null
+   value: false
+ include_num_input_tokens_seen:
+   desc: null
+   value: false
+ neftune_noise_alpha:
+   desc: null
+   value: null
+ optim_target_modules:
+   desc: null
+   value: null
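
The trainer arguments logged in this config imply an effective batch size and a learning-rate curve that the file never states directly. The following is a minimal sketch, not the training code from this commit, of how `per_device_train_batch_size: 6`, `gradient_accumulation_steps: 4`, `learning_rate: 0.0002`, `warmup_steps: 2`, and `lr_scheduler_type: linear` combine. The device count and the total number of optimizer steps are assumptions (the config records only `num_train_epochs: 30`), and the function names are hypothetical.

```python
# Values copied from the logged config.yaml.
PER_DEVICE_TRAIN_BATCH_SIZE = 6
GRADIENT_ACCUMULATION_STEPS = 4
LEARNING_RATE = 2e-4
WARMUP_STEPS = 2


def effective_batch_size(num_devices: int = 1) -> int:
    """Samples consumed per optimizer step.

    num_devices is an assumption; the config does not record it.
    """
    return PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices


def linear_lr(step: int, total_steps: int) -> float:
    """Linear warmup from 0, then linear decay to 0.

    total_steps is a hypothetical input; it is not stored in the config.
    """
    if step < WARMUP_STEPS:
        scale = step / max(1, WARMUP_STEPS)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - WARMUP_STEPS))
    return LEARNING_RATE * scale
```

Under the single-device assumption, `effective_batch_size()` works out to 24 samples per optimizer step, and the learning rate reaches its peak of 0.0002 at step 2 before decaying linearly.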
wandb/run-20240430_134330-p3yujlfg/files/requirements.txt ADDED
@@ -0,0 +1,868 @@
+ Babel==2.14.0
+ Boruta==0.3
+ Brotli==1.0.9
+ CVXcanon==0.1.2
+ Cartopy==0.23.0
+ Cython==3.0.8
+ Deprecated==1.2.14
+ Farama-Notifications==0.0.4
+ Flask==3.0.3
+ Geohash==1.0
+ GitPython==3.1.41
+ ImageHash==4.3.1
+ Janome==0.5.0
+ Jinja2==3.1.2
+ LunarCalendar==0.0.9
+ Mako==1.3.3
+ Markdown==3.5.2
+ MarkupSafe==2.1.3
+ MarkupSafe==2.1.5
+ Pillow==9.5.0
+ PuLP==2.8.0
+ PyArabic==0.6.15
+ PyJWT==2.8.0
+ PyMeeus==0.5.12
+ PySocks==1.7.1
+ PyUpSet==0.1.1.post7
+ PyWavelets==1.5.0
+ PyYAML==6.0.1
+ Pygments==2.17.2
+ Pympler==1.0.1
+ QtPy==2.4.1
+ Rtree==1.2.0
+ SQLAlchemy==2.0.25
+ SecretStorage==3.3.3
+ Send2Trash==1.8.2
+ Shapely==1.8.5.post1
+ Shimmy==1.3.0
+ SimpleITK==2.3.1
+ TPOT==0.12.1
+ Theano-PyMC==1.1.2
+ Theano==1.0.5
+ Wand==0.6.13
+ Werkzeug==3.0.2
+ absl-py==1.4.0
+ accelerate==0.29.3
+ access==1.1.9
+ affine==2.4.0
+ aiobotocore==2.12.3
+ aiofiles==22.1.0
+ aiohttp-cors==0.7.0
+ aiohttp==3.9.1
+ aioitertools==0.11.0
+ aiorwlock==1.3.0
+ aiosignal==1.3.1
+ aiosqlite==0.19.0
+ albumentations==1.4.0
+ alembic==1.13.1
+ altair==5.3.0
+ annotated-types==0.6.0
+ annoy==1.17.3
+ anyio==4.2.0
+ apache-beam==2.46.0
+ aplus==0.11.0
+ appdirs==1.4.4
+ archspec==0.2.3
+ argon2-cffi-bindings==21.2.0
+ argon2-cffi==23.1.0
+ array-record==0.5.0
+ arrow==1.3.0
+ arviz==0.18.0
+ astroid==3.1.0
+ astropy-iers-data==0.2024.4.15.2.45.49
+ astropy==6.0.1
+ asttokens==2.4.1
+ astunparse==1.6.3
+ async-lru==2.0.4
+ async-timeout==4.0.3
+ attrs==23.2.0
+ audioread==3.0.1
+ auto_gptq==0.7.1
+ autopep8==2.0.4
+ backoff==2.2.1
+ bayesian-optimization==1.4.3
+ beatrix_jupyterlab==2023.128.151533
+ beautifulsoup4==4.12.2
+ bitsandbytes==0.43.1
+ blake3==0.2.1
+ bleach==6.1.0
+ blessed==1.20.0
+ blinker==1.7.0
+ blis==0.7.10
+ blosc2==2.6.2
+ bokeh==3.4.1
+ boltons==23.1.1
+ boto3==1.26.100
+ botocore==1.34.69
+ bq_helper==0.4.1
+ bqplot==0.12.43
+ branca==0.7.1
+ brewer2mpl==1.4.1
+ brotlipy==0.7.0
+ cached-property==1.5.2
+ cachetools==4.2.4
+ cachetools==5.3.2
+ catalogue==2.0.10
+ catalyst==22.4
+ catboost==1.2.3
+ category-encoders==2.6.3
+ certifi==2024.2.2
+ cesium==0.12.1
+ cffi==1.16.0
+ charset-normalizer==3.3.2
+ chex==0.1.86
+ cleverhans==4.0.0
+ click-plugins==1.1.1
+ click==8.1.7
+ cligj==0.7.2
+ cloud-tpu-client==0.10
+ cloud-tpu-profiler==2.4.0
+ cloudpathlib==0.16.0
+ cloudpickle==2.2.1
+ cloudpickle==3.0.0
+ cmdstanpy==1.2.2
+ colorama==0.4.6
+ colorcet==3.1.0
+ coloredlogs==15.0.1
+ colorful==0.5.6
+ colorlog==6.8.2
+ colorlover==0.3.0
+ comm==0.2.1
+ conda-libmamba-solver==23.7.0
+ conda-package-handling==2.2.0
+ conda==23.7.4
+ conda_package_streaming==0.9.0
+ confection==0.1.4
+ contextily==1.6.0
+ contourpy==1.2.0
+ contourpy==1.2.1
+ convertdate==2.4.0
+ crcmod==1.7
+ cryptography==41.0.7
+ cuda-python==12.4.0
+ cudf==23.8.0
+ cufflinks==0.17.3
+ cuml==23.8.0
+ cupy==13.0.0
+ cycler==0.12.1
+ cymem==2.0.8
+ cytoolz==0.12.3
+ daal4py==2024.3.0
+ daal==2024.3.0
+ dacite==1.8.1
+ dask-cuda==23.8.0
+ dask-cudf==23.8.0
+ dask-expr==1.0.11
+ dask==2024.4.1
+ dataclasses-json==0.6.4
+ dataproc_jupyter_plugin==0.1.66
+ datasets==2.18.0
+ datashader==0.16.0
+ datatile==1.0.3
+ db-dtypes==1.2.0
+ deap==1.4.1
+ debugpy==1.8.0
+ decorator==5.1.1
+ deepdiff==7.0.1
+ defusedxml==0.7.1
+ deprecation==2.1.0
+ descartes==1.1.0
+ dill==0.3.8
+ dipy==1.9.0
+ distlib==0.3.8
+ distributed==2023.7.1
+ distro==1.9.0
+ dm-tree==0.1.8
+ docker-pycreds==0.4.0
+ docker==7.0.0
+ docopt==0.6.2
+ docstring-parser==0.15
+ docstring-to-markdown==0.15
+ docutils==0.21.1
+ earthengine-api==0.1.399
+ easydict==1.13
+ easyocr==1.7.1
+ ecos==2.0.13
+ eli5==0.13.0
+ emoji==2.11.0
+ en-core-web-lg==3.7.1
+ en-core-web-sm==3.7.1
+ entrypoints==0.4
+ ephem==4.1.5
+ esda==2.5.1
+ essentia==2.1b6.dev1110
+ et-xmlfile==1.1.0
+ etils==1.6.0
+ exceptiongroup==1.2.0
+ executing==2.0.1
+ explainable-ai-sdk==1.3.3
+ fastai==2.7.14
+ fastapi==0.108.0
+ fastavro==1.9.3
+ fastcore==1.5.29
+ fastdownload==0.0.7
+ fasteners==0.19
+ fastjsonschema==2.19.1
+ fastprogress==1.0.3
+ fastrlock==0.8.2
+ fasttext==0.9.2
+ feather-format==0.4.1
+ featuretools==1.30.0
+ filelock==3.13.1
+ fiona==1.9.6
+ fitter==1.7.0
+ flake8==7.0.0
+ flashtext==2.7
+ flatbuffers==23.5.26
+ flax==0.8.2
+ folium==0.16.0
+ fonttools==4.47.0
+ fonttools==4.51.0
+ fqdn==1.5.1
+ frozendict==2.4.2
+ frozenlist==1.4.1
+ fsspec==2024.2.0
+ fsspec==2024.3.1
+ funcy==2.0
+ fury==0.10.0
+ future==1.0.0
+ fuzzywuzzy==0.18.0
+ gast==0.5.4
+ gatspy==0.3
+ gcsfs==2024.2.0
+ gekko==1.1.1
+ gensim==4.3.2
+ geographiclib==2.0
+ geojson==3.1.0
+ geopandas==0.14.3
+ geoplot==0.5.1
+ geopy==2.4.1
+ geoviews==1.12.0
+ ggplot==0.11.5
+ giddy==2.3.5
+ gitdb==4.0.11
+ google-ai-generativelanguage==0.6.2
+ google-api-core==2.11.1
+ google-api-core==2.18.0
+ google-api-python-client==2.126.0
+ google-apitools==0.5.31
+ google-auth-httplib2==0.2.0
+ google-auth-oauthlib==1.2.0
+ google-auth==2.26.1
+ google-cloud-aiplatform==0.6.0a1
+ google-cloud-artifact-registry==1.10.0
+ google-cloud-automl==1.0.1
+ google-cloud-bigquery==2.34.4
+ google-cloud-bigtable==1.7.3
+ google-cloud-core==2.4.1
+ google-cloud-datastore==2.19.0
+ google-cloud-dlp==3.14.0
+ google-cloud-jupyter-config==0.0.5
+ google-cloud-language==2.13.3
+ google-cloud-monitoring==2.18.0
+ google-cloud-pubsub==2.19.0
+ google-cloud-pubsublite==1.9.0
+ google-cloud-recommendations-ai==0.7.1
+ google-cloud-resource-manager==1.11.0
+ google-cloud-spanner==3.40.1
+ google-cloud-storage==1.44.0
+ google-cloud-translate==3.12.1
+ google-cloud-videointelligence==2.13.3
+ google-cloud-vision==2.8.0
+ google-crc32c==1.5.0
+ google-generativeai==0.5.1
+ google-pasta==0.2.0
+ google-resumable-media==2.7.0
+ googleapis-common-protos==1.62.0
+ gplearn==0.4.2
+ gpustat==1.0.0
+ gpxpy==1.6.2
+ graphviz==0.20.3
+ greenlet==3.0.3
+ grpc-google-iam-v1==0.12.7
+ grpcio-status==1.48.1
+ grpcio-status==1.48.2
+ grpcio==1.51.1
+ grpcio==1.60.0
+ gviz-api==1.10.0
+ gym-notices==0.0.8
+ gym==0.26.2
+ gymnasium==0.29.0
+ h11==0.14.0
+ h2o==3.46.0.1
+ h5netcdf==1.3.0
+ h5py==3.10.0
+ haversine==2.8.1
+ hdfs==2.7.3
+ hep-ml==0.7.2
+ hijri-converter==2.3.1
+ hmmlearn==0.3.2
+ holidays==0.24
+ holoviews==1.18.3
+ hpsklearn==0.1.0
+ html5lib==1.1
+ htmlmin==0.1.12
+ httpcore==1.0.5
+ httplib2==0.21.0
+ httptools==0.6.1
+ httpx==0.27.0
+ huggingface-hub==0.22.2
+ humanfriendly==10.0
+ hunspell==0.5.5
+ hydra-slayer==0.5.0
+ hyperopt==0.2.7
+ hypertools==0.8.0
+ idna==3.6
+ igraph==0.11.4
+ imagecodecs==2024.1.1
+ imageio==2.33.1
+ imbalanced-learn==0.12.2
+ imgaug==0.4.0
+ importlib-metadata==6.11.0
+ importlib-metadata==7.0.1
+ importlib-resources==6.1.1
+ inequality==1.0.1
+ iniconfig==2.0.0
+ ipydatawidgets==4.3.5
+ ipykernel==6.28.0
+ ipyleaflet==0.18.2
+ ipympl==0.7.0
+ ipython-genutils==0.2.0
+ ipython-genutils==0.2.0
+ ipython-sql==0.5.0
+ ipython==8.20.0
+ ipyvolume==0.6.3
+ ipyvue==1.11.0
+ ipyvuetify==1.9.4
+ ipywebrtc==0.6.0
+ ipywidgets==7.7.1
+ isoduration==20.11.0
+ isort==5.13.2
+ isoweek==1.3.3
+ itsdangerous==2.2.0
+ jaraco.classes==3.3.0
+ jax-jumpy==1.0.0
+ jax==0.4.23
+ jaxlib==0.4.23.dev20240116
+ jedi==0.19.1
+ jeepney==0.8.0
+ jieba==0.42.1
+ jmespath==1.0.1
+ joblib==1.4.0
+ json5==0.9.14
+ jsonpatch==1.33
+ jsonpointer==2.4
+ jsonschema-specifications==2023.12.1
+ jsonschema==4.20.0
+ jupyter-console==6.6.3
+ jupyter-events==0.9.0
+ jupyter-http-over-ws==0.0.8
+ jupyter-lsp==1.5.1
+ jupyter-server-mathjax==0.2.6
+ jupyter-ydoc==0.2.5
+ jupyter_client==7.4.9
+ jupyter_client==8.6.0
+ jupyter_core==5.7.1
+ jupyter_server==2.12.5
+ jupyter_server_fileid==0.9.1
+ jupyter_server_proxy==4.1.0
+ jupyter_server_terminals==0.5.1
+ jupyter_server_ydoc==0.8.0
+ jupyterlab-lsp==5.1.0
+ jupyterlab-widgets==3.0.9
+ jupyterlab==4.1.6
+ jupyterlab_git==0.44.0
+ jupyterlab_pygments==0.3.0
+ jupyterlab_server==2.25.2
+ jupytext==1.16.0
+ kaggle-environments==1.14.3
+ kaggle==1.6.12
+ kagglehub==0.2.3
+ keras-cv==0.8.2
+ keras-nlp==0.9.3
+ keras-tuner==1.4.6
+ keras==3.2.1
+ kernels-mixer==0.0.7
+ keyring==24.3.0
+ keyrings.google-artifactregistry-auth==1.1.2
+ kfp-pipeline-spec==0.2.2
+ kfp-server-api==2.0.5
+ kfp==2.5.0
+ kiwisolver==1.4.5
+ kmapper==2.0.1
+ kmodes==0.12.2
+ korean-lunar-calendar==0.3.1
+ kornia==0.7.2
+ kornia_rs==0.1.3
+ kt-legacy==1.0.5
+ kubernetes==26.1.0
+ langcodes==3.3.0
+ langid==1.1.6
+ lazy_loader==0.3
+ learntools==0.3.4
+ leven==1.0.4
+ libclang==16.0.6
+ libmambapy==1.5.0
+ libpysal==4.9.2
+ librosa==0.10.1
+ lightgbm==4.2.0
+ lightning-utilities==0.11.2
+ lime==0.2.0.1
+ line-profiler==4.1.2
+ linkify-it-py==2.0.3
+ llvmlite==0.41.1
+ llvmlite==0.42.0
+ lml==0.1.0
+ locket==1.0.0
+ loguru==0.7.2
+ lxml==5.2.1
+ lz4==4.3.3
+ mamba==1.5.0
+ mapclassify==2.6.1
+ markdown-it-py==3.0.0
+ marshmallow==3.21.1
+ matplotlib-inline==0.1.6
+ matplotlib-venn==0.11.10
+ matplotlib==3.7.5
+ matplotlib==3.8.4
+ mccabe==0.7.0
+ mdit-py-plugins==0.4.0
+ mdurl==0.1.2
+ memory-profiler==0.61.0
+ menuinst==2.0.1
+ mercantile==1.2.1
+ mgwr==2.2.1
+ missingno==0.5.2
+ mistune==0.8.4
+ mizani==0.11.1
+ ml-dtypes==0.2.0
+ mlcrate==0.2.0
+ mlens==0.2.3
+ mlxtend==0.23.1
+ mne==1.6.1
+ mnist==0.2.2
+ momepy==0.7.0
+ more-itertools==10.2.0
+ mpld3==0.5.10
+ mpmath==1.3.0
+ msgpack==1.0.7
+ multidict==6.0.4
+ multimethod==1.10
+ multipledispatch==1.0.0
+ multiprocess==0.70.16
+ munkres==1.1.4
+ murmurhash==1.0.10
+ mypy-extensions==1.0.0
+ namex==0.0.8
+ nb-conda-kernels==2.3.1
+ nb_conda==2.2.1
+ nbclassic==1.0.0
+ nbclient==0.5.13
+ nbconvert==6.4.5
+ nbdime==3.2.0
+ nbformat==5.9.2
+ ndindex==1.8
+ nest-asyncio==1.5.8
+ networkx==3.2.1
+ nibabel==5.2.1
+ nilearn==0.10.4
+ ninja==1.11.1.1
+ nltk==3.2.4
+ nose==1.3.7
+ notebook==6.5.4
+ notebook==6.5.6
+ notebook_executor==0.2
+ notebook_shim==0.2.3
+ numba==0.58.1
+ numba==0.59.1
+ numexpr==2.10.0
+ numpy==1.26.4
+ nvidia-ml-py==11.495.46
+ nvtx==0.2.10
+ oauth2client==4.1.3
+ oauthlib==3.2.2
+ objsize==0.6.1
+ odfpy==1.4.1
+ olefile==0.47
+ onnx==1.16.0
+ opencensus-context==0.1.3
+ opencensus==0.11.4
+ opencv-contrib-python==4.9.0.80
+ opencv-python-headless==4.9.0.80
+ opencv-python==4.9.0.80
+ openpyxl==3.1.2
+ openslide-python==1.3.1
+ opentelemetry-api==1.22.0
+ opentelemetry-exporter-otlp-proto-common==1.22.0
+ opentelemetry-exporter-otlp-proto-grpc==1.22.0
+ opentelemetry-exporter-otlp-proto-http==1.22.0
+ opentelemetry-exporter-otlp==1.22.0
+ opentelemetry-proto==1.22.0
+ opentelemetry-sdk==1.22.0
+ opentelemetry-semantic-conventions==0.43b0
+ opt-einsum==3.3.0
+ optax==0.2.2
+ optimum==1.19.1
+ optree==0.11.0
+ optuna==3.6.1
+ orbax-checkpoint==0.5.9
+ ordered-set==4.1.0
+ orjson==3.9.10
+ ortools==9.4.1874
+ osmnx==1.9.2
+ overrides==7.4.0
+ packaging==21.3
+ pandas-datareader==0.10.0
+ pandas-profiling==3.6.6
+ pandas-summary==0.2.0
+ pandas==2.1.4
+ pandas==2.2.2
+ pandasql==0.7.3
+ pandocfilters==1.5.0
+ panel==1.4.1
+ papermill==2.5.0
+ param==2.1.0
+ parso==0.8.3
+ partd==1.4.1
+ path.py==12.5.0
+ path==16.14.0
+ pathos==0.3.2
+ pathy==0.10.3
+ patsy==0.5.6
+ pdf2image==1.17.0
+ peft==0.10.0
+ pettingzoo==1.24.0
+ pexpect==4.8.0
+ pexpect==4.9.0
+ phik==0.12.4
+ pickleshare==0.7.5
+ pillow==10.3.0
+ pip==23.3.2
541
+ pkgutil_resolve_name==1.3.10
542
+ platformdirs==4.2.0
543
+ plotly-express==0.4.1
544
+ plotly==5.18.0
545
+ plotnine==0.13.4
546
+ pluggy==1.4.0
547
+ pointpats==2.4.0
548
+ polars==0.20.21
549
+ polyglot==16.7.4
550
+ pooch==1.8.1
551
+ pox==0.3.4
+ ppca==0.0.4
+ ppft==1.7.6.8
+ preprocessing==0.1.13
+ preshed==3.0.9
+ prettytable==3.9.0
+ progressbar2==4.4.2
+ prometheus-client==0.19.0
+ promise==2.3
+ prompt-toolkit==3.0.42
+ prompt-toolkit==3.0.43
+ prophet==1.1.1
+ proto-plus==1.23.0
+ protobuf==3.20.3
+ protobuf==4.21.12
+ psutil==5.9.3
+ psutil==5.9.7
+ ptyprocess==0.7.0
+ pudb==2024.1
+ pure-eval==0.2.2
+ py-cpuinfo==9.0.0
+ py-spy==0.3.14
+ py4j==0.10.9.7
+ pyLDAvis==3.4.1
+ pyOpenSSL==23.3.0
+ pyaml==23.12.0
+ pyarrow-hotfix==0.6
+ pyarrow==15.0.2
+ pyasn1-modules==0.3.0
+ pyasn1==0.5.1
+ pybind11==2.12.0
+ pyclipper==1.3.0.post5
+ pycodestyle==2.11.1
+ pycosat==0.6.6
+ pycparser==2.21
+ pycryptodome==3.20.0
+ pyct==0.5.0
+ pycuda==2024.1
+ pydantic==2.5.3
+ pydantic==2.7.0
+ pydantic_core==2.14.6
+ pydantic_core==2.18.1
+ pydegensac==0.1.2
+ pydicom==2.4.4
+ pydocstyle==6.3.0
+ pydot==1.4.2
+ pydub==0.25.1
+ pyemd==1.0.0
+ pyerfa==2.0.1.4
+ pyexcel-io==0.6.6
+ pyexcel-ods==0.6.0
+ pyflakes==3.2.0
+ pygltflib==1.16.2
+ pykalman==0.9.7
+ pylibraft==23.8.0
+ pylint==3.1.0
+ pymc3==3.11.4
+ pymongo==3.13.0
+ pynndescent==0.5.12
+ pynvml==11.4.1
+ pynvrtc==9.2
+ pyparsing==3.1.1
+ pyparsing==3.1.2
+ pypdf==4.2.0
+ pyproj==3.6.1
+ pysal==24.1
+ pyshp==2.3.1
+ pytesseract==0.3.10
+ pytest==8.1.1
+ python-bidi==0.4.2
+ python-dateutil==2.9.0.post0
+ python-dotenv==1.0.0
+ python-json-logger==2.0.7
+ python-louvain==0.16
+ python-lsp-jsonrpc==1.1.2
+ python-lsp-server==1.11.0
+ python-slugify==8.0.4
+ python-utils==3.8.2
+ pythreejs==2.4.2
+ pytoolconfig==1.3.1
+ pytools==2024.1.1
+ pytorch-ignite==0.5.0.post2
+ pytorch-lightning==2.2.2
+ pytz==2023.3.post1
+ pytz==2024.1
+ pyu2f==0.1.5
+ pyviz_comms==3.0.2
+ pyzmq==24.0.1
+ pyzmq==25.1.2
+ qgrid==1.3.1
+ qtconsole==5.5.1
+ quantecon==0.7.2
+ qudida==0.0.4
+ raft-dask==23.8.0
+ rasterio==1.3.10
+ rasterstats==0.19.0
+ ray-cpp==2.9.0
+ ray==2.9.0
+ referencing==0.32.1
+ regex==2023.12.25
+ requests-oauthlib==1.3.1
+ requests-toolbelt==0.10.1
+ requests==2.31.0
+ retrying==1.3.3
+ retrying==1.3.4
+ rfc3339-validator==0.1.4
+ rfc3986-validator==0.1.1
+ rgf-python==3.12.0
+ rich-click==1.7.4
+ rich==13.7.0
+ rich==13.7.1
+ rmm==23.8.0
+ rope==1.13.0
+ rouge==1.0.1
+ rpds-py==0.16.2
+ rsa==4.9
+ ruamel-yaml-conda==0.15.100
+ ruamel.yaml.clib==0.2.7
+ ruamel.yaml==0.17.40
+ s2sphere==0.2.5
+ s3fs==2024.2.0
+ s3transfer==0.6.2
+ safetensors==0.4.3
+ scattertext==0.1.19
+ scikit-image==0.22.0
+ scikit-learn-intelex==2024.3.0
+ scikit-learn==1.2.2
+ scikit-multilearn==0.2.0
+ scikit-optimize==0.10.1
+ scikit-plot==0.3.7
+ scikit-surprise==1.1.3
+ scipy==1.11.4
+ scipy==1.13.0
+ seaborn==0.12.2
+ segment_anything==1.0
+ segregation==2.5
+ semver==3.0.2
+ sentencepiece==0.2.0
+ sentry-sdk==1.45.0
+ setproctitle==1.3.3
+ setuptools-git==1.2
+ setuptools-scm==8.0.4
+ setuptools==69.0.3
+ shap==0.44.1
+ shapely==2.0.4
+ shellingham==1.5.4
+ simpervisor==1.0.0
+ simplejson==3.19.2
+ six==1.16.0
+ sklearn-pandas==2.2.0
+ slicer==0.0.7
+ smart-open==6.4.0
+ smmap==5.0.1
+ sniffio==1.3.0
+ snowballstemmer==2.2.0
+ snuggs==1.4.7
+ sortedcontainers==2.4.0
+ soundfile==0.12.1
+ soupsieve==2.5
+ soxr==0.3.7
+ spacy-legacy==3.0.12
+ spacy-loggers==1.0.5
+ spacy==3.7.3
+ spaghetti==1.7.5.post1
+ spectral==0.23.1
+ spglm==1.1.0
+ sphinx-rtd-theme==0.2.4
+ spint==1.0.7
+ splot==1.1.5.post1
+ spopt==0.6.0
+ spreg==1.4.2
+ spvcm==0.3.0
+ sqlparse==0.4.4
+ squarify==0.4.3
+ srsly==2.4.8
+ stable-baselines3==2.1.0
+ stack-data==0.6.2
+ stack-data==0.6.3
+ stanio==0.5.0
+ starlette==0.32.0.post1
+ statsmodels==0.14.1
+ stemming==1.0.1
+ stop-words==2018.7.23
+ stopit==1.1.2
+ stumpy==1.12.0
+ sympy==1.12
+ tables==3.9.2
+ tabulate==0.9.0
+ tangled-up-in-unicode==0.2.0
+ tbb==2021.12.0
+ tblib==3.0.0
+ tenacity==8.2.3
+ tensorboard-data-server==0.7.2
+ tensorboard-plugin-profile==2.15.0
+ tensorboard==2.15.1
+ tensorboardX==2.6.2.2
+ tensorflow-cloud==0.1.16
+ tensorflow-datasets==4.9.4
+ tensorflow-decision-forests==1.8.1
+ tensorflow-estimator==2.15.0
+ tensorflow-hub==0.16.1
+ tensorflow-io-gcs-filesystem==0.35.0
+ tensorflow-io==0.35.0
+ tensorflow-metadata==0.14.0
+ tensorflow-probability==0.23.0
+ tensorflow-serving-api==2.14.1
+ tensorflow-text==2.15.0
+ tensorflow-transform==0.14.0
+ tensorflow==2.15.0
+ tensorstore==0.1.56
+ termcolor==2.4.0
+ terminado==0.18.0
+ testpath==0.6.0
+ text-unidecode==1.3
+ textblob==0.18.0.post0
+ texttable==1.7.0
+ tf_keras==2.15.1
+ tfp-nightly==0.24.0.dev0
+ thinc==8.2.2
+ threadpoolctl==3.2.0
+ tifffile==2023.12.9
+ timm==0.9.16
+ tinycss2==1.2.1
+ tobler==0.11.2
+ tokenizers==0.15.2
+ toml==0.10.2
+ tomli==2.0.1
+ tomlkit==0.12.4
+ toolz==0.12.1
+ torch==2.1.2
+ torchaudio==2.1.2
+ torchdata==0.7.1
+ torchinfo==1.8.0
+ torchmetrics==1.3.2
+ torchtext==0.16.2
+ torchvision==0.16.2
+ tornado==6.3.3
+ tqdm==4.66.1
+ traceml==1.0.8
+ traitlets==5.9.0
+ traittypes==0.2.1
+ transformers==4.39.3
+ treelite-runtime==3.2.0
+ treelite==3.2.0
+ truststore==0.8.0
+ trx-python==0.2.9
+ tsfresh==0.20.2
+ typeguard==4.1.5
+ typer==0.9.0
+ typer==0.9.4
+ types-python-dateutil==2.8.19.20240106
+ typing-inspect==0.9.0
+ typing-utils==0.1.0
+ typing_extensions==4.9.0
+ tzdata==2023.4
+ uc-micro-py==1.0.3
+ ucx-py==0.33.0
+ ujson==5.9.0
+ umap-learn==0.5.6
+ unicodedata2==15.1.0
+ update-checker==0.18.0
+ uri-template==1.3.0
+ uritemplate==3.0.1
+ urllib3==1.26.18
+ urllib3==2.1.0
+ urwid==2.6.10
+ urwid_readline==0.14
+ uvicorn==0.25.0
+ uvloop==0.19.0
+ vaex-astro==0.9.3
+ vaex-core==4.17.1
+ vaex-hdf5==0.14.1
+ vaex-jupyter==0.8.2
+ vaex-ml==0.18.3
+ vaex-server==0.9.0
+ vaex-viz==0.5.4
+ vaex==4.17.0
+ vec_noise==1.1.4
+ vecstack==0.4.0
+ virtualenv==20.21.0
+ visions==0.7.5
+ vowpalwabbit==9.9.0
+ vtk==9.3.0
+ wandb==0.16.6
+ wasabi==1.1.2
+ watchfiles==0.21.0
+ wavio==0.0.8
+ wcwidth==0.2.13
+ weasel==0.3.4
+ webcolors==1.13
+ webencodings==0.5.1
+ websocket-client==1.7.0
+ websockets==12.0
+ wfdb==4.1.2
+ whatthepatch==1.0.5
+ wheel==0.42.0
+ widgetsnbextension==3.6.6
+ witwidget==1.8.1
+ woodwork==0.30.0
+ wordcloud==1.9.3
+ wordsegment==1.3.1
+ wrapt==1.14.1
+ xarray-einstats==0.7.0
+ xarray==2024.3.0
+ xgboost==2.0.3
+ xvfbwrapper==0.2.9
+ xxhash==3.4.1
+ xyzservices==2024.4.0
+ y-py==0.6.2
+ yapf==0.40.2
+ yarl==1.9.3
+ yarl==1.9.4
+ ydata-profiling==4.6.4
+ yellowbrick==1.5
+ ypy-websocket==0.8.4
+ zict==3.0.0
+ zipp==3.17.0
+ zstandard==0.22.0
wandb/run-20240430_134330-p3yujlfg/files/wandb-metadata.json ADDED
@@ -0,0 +1,66 @@
+ {
+   "os": "Linux-5.15.133+-x86_64-with-glibc2.31",
+   "python": "3.10.13",
+   "heartbeatAt": "2024-04-30T13:43:31.191665",
+   "startedAt": "2024-04-30T13:43:30.422591",
+   "docker": null,
+   "cuda": null,
+   "args": [],
+   "state": "running",
+   "program": "kaggle.ipynb",
+   "codePathLocal": null,
+   "root": "/kaggle/working",
+   "host": "fde755c1ca53",
+   "username": "root",
+   "executable": "/opt/conda/bin/python3.10",
+   "cpu_count": 2,
+   "cpu_count_logical": 4,
+   "cpu_freq": {
+     "current": 2000.144,
+     "min": 0.0,
+     "max": 0.0
+   },
+   "cpu_freq_per_core": [
+     {
+       "current": 2000.144,
+       "min": 0.0,
+       "max": 0.0
+     },
+     {
+       "current": 2000.144,
+       "min": 0.0,
+       "max": 0.0
+     },
+     {
+       "current": 2000.144,
+       "min": 0.0,
+       "max": 0.0
+     },
+     {
+       "current": 2000.144,
+       "min": 0.0,
+       "max": 0.0
+     }
+   ],
+   "disk": {
+     "/": {
+       "total": 8062.387607574463,
+       "used": 5605.322723388672
+     }
+   },
+   "gpu": "Tesla T4",
+   "gpu_count": 2,
+   "gpu_devices": [
+     {
+       "name": "Tesla T4",
+       "memory_total": 16106127360
+     },
+     {
+       "name": "Tesla T4",
+       "memory_total": 16106127360
+     }
+   ],
+   "memory": {
+     "total": 31.357559204101562
+   }
+ }
wandb/run-20240430_134330-p3yujlfg/files/wandb-summary.json ADDED
@@ -0,0 +1 @@
+ {"train/loss": 0.0494, "train/grad_norm": 0.7622084617614746, "train/learning_rate": 2.2727272727272728e-06, "train/epoch": 30.0, "train/global_step": 90, "_timestamp": 1714485117.8533459, "_runtime": 507.42275285720825, "_step": 60, "eval/loss": 0.43639254570007324, "eval/runtime": 1.3775, "eval/samples_per_second": 13.067, "eval/steps_per_second": 2.178, "train_runtime": 527.4654, "train_samples_per_second": 4.038, "train_steps_per_second": 0.171, "total_flos": 5669487917678592.0, "train_loss": 0.4407807625002331}
wandb/run-20240430_134330-p3yujlfg/logs/debug-internal.log ADDED
The diff for this file is too large to render. See raw diff
 
wandb/run-20240430_134330-p3yujlfg/logs/debug.log ADDED
@@ -0,0 +1,43 @@
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Current SDK version is 0.16.6
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Configure stats pid to 34
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /root/.config/wandb/settings
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from /kaggle/working/wandb/settings
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Loading settings from environment variables: {}
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying setup settings: {'_disable_service': False}
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Inferring run settings from compute environment: {'program': '<python with no main file>'}
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {}
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_setup.py:_flush():76] Applying login settings: {'api_key': '***REDACTED***'}
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_init.py:_log_setup():521] Logging user logs to /kaggle/working/wandb/run-20240430_134330-p3yujlfg/logs/debug.log
+ 2024-04-30 13:43:30,424 INFO MainThread:34 [wandb_init.py:_log_setup():522] Logging internal logs to /kaggle/working/wandb/run-20240430_134330-p3yujlfg/logs/debug-internal.log
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:_jupyter_setup():467] configuring jupyter hooks <wandb.sdk.wandb_init._WandbInit object at 0x781a89bef460>
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():561] calling init triggers
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():568] wandb.init called with sweep_config: {}
+ config: {}
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():611] starting backend
+ 2024-04-30 13:43:30,425 INFO MainThread:34 [wandb_init.py:init():615] setting up manager
+ 2024-04-30 13:43:30,427 INFO MainThread:34 [backend.py:_multiprocessing_setup():105] multiprocessing start_methods=fork,spawn,forkserver, using: spawn
+ 2024-04-30 13:43:30,430 INFO MainThread:34 [wandb_init.py:init():623] backend started and connected
+ 2024-04-30 13:43:30,442 INFO MainThread:34 [wandb_run.py:_label_probe_notebook():1299] probe notebook
+ 2024-04-30 13:43:30,930 INFO MainThread:34 [wandb_init.py:init():715] updated telemetry
+ 2024-04-30 13:43:30,933 INFO MainThread:34 [wandb_init.py:init():748] communicating run to backend with 90.0 second timeout
+ 2024-04-30 13:43:31,069 INFO MainThread:34 [wandb_run.py:_on_init():2357] communicating current version
+ 2024-04-30 13:43:31,155 INFO MainThread:34 [wandb_run.py:_on_init():2366] got version response
+ 2024-04-30 13:43:31,157 INFO MainThread:34 [wandb_init.py:init():799] starting run threads in backend
+ 2024-04-30 13:43:47,243 INFO MainThread:34 [wandb_run.py:_console_start():2335] atexit reg
+ 2024-04-30 13:43:47,244 INFO MainThread:34 [wandb_run.py:_redirect():2190] redirect: wrap_raw
+ 2024-04-30 13:43:47,245 INFO MainThread:34 [wandb_run.py:_redirect():2255] Wrapping output streams.
+ 2024-04-30 13:43:47,245 INFO MainThread:34 [wandb_run.py:_redirect():2280] Redirects installed.
+ 2024-04-30 13:43:47,246 INFO MainThread:34 [wandb_init.py:init():842] run started, returning control to user process
+ 2024-04-30 13:43:47,253 INFO MainThread:34 [wandb_run.py:_config_callback():1347] config_cb None None {'vocab_size': 32064, 'hidden_size': 3072, 'intermediate_size': 8192, 'num_hidden_layers': 32, 'num_attention_heads': 32, 'num_key_value_heads': 32, 'resid_pdrop': 0.0, 'embd_pdrop': 0.0, 'attention_dropout': 0.0, 'hidden_act': 'silu', 'max_position_embeddings': 131072, 'original_max_position_embeddings': 4096, 'initializer_range': 0.02, 'rms_norm_eps': 1e-05, 'use_cache': False, 'rope_theta': 10000.0, 'rope_scaling': {'long_factor': [1.0299999713897705, 1.0499999523162842, 1.0499999523162842, 1.0799999237060547, 1.2299998998641968, 1.2299998998641968, 1.2999999523162842, 1.4499999284744263, 1.5999999046325684, 1.6499998569488525, 1.8999998569488525, 2.859999895095825, 3.68999981880188, 5.419999599456787, 5.489999771118164, 5.489999771118164, 9.09000015258789, 11.579999923706055, 15.65999984741211, 15.769999504089355, 15.789999961853027, 18.360000610351562, 21.989999771118164, 23.079999923706055, 30.009998321533203, 32.35000228881836, 32.590003967285156, 35.56000518798828, 39.95000457763672, 53.840003967285156, 56.20000457763672, 57.95000457763672, 59.29000473022461, 59.77000427246094, 59.920005798339844, 61.190006256103516, 61.96000671386719, 62.50000762939453, 63.3700065612793, 63.48000717163086, 63.48000717163086, 63.66000747680664, 63.850006103515625, 64.08000946044922, 64.760009765625, 64.80001068115234, 64.81001281738281, 64.81001281738281], 'short_factor': [1.05, 1.05, 1.05, 1.1, 1.1, 1.1500000000000001, 1.2000000000000002, 1.2500000000000002, 1.3000000000000003, 1.3500000000000003, 1.5000000000000004, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 2.000000000000001, 
2.000000000000001, 2.000000000000001, 2.0500000000000007, 2.0500000000000007, 2.0500000000000007, 2.1000000000000005, 2.1000000000000005, 2.1000000000000005, 2.1500000000000004, 2.1500000000000004, 2.3499999999999996, 2.549999999999999, 2.5999999999999988, 2.5999999999999988, 2.7499999999999982, 2.849999999999998, 2.849999999999998, 2.9499999999999975], 'type': 'su'}, 'sliding_window': 262144, 'return_dict': True, 'output_hidden_states': False, 'output_attentions': False, 'torchscript': False, 'torch_dtype': 'bfloat16', 'use_bfloat16': False, 'tf_legacy_loss': False, 'pruned_heads': {}, 'tie_word_embeddings': False, 'chunk_size_feed_forward': 0, 'is_encoder_decoder': False, 'is_decoder': False, 'cross_attention_hidden_size': None, 'add_cross_attention': False, 'tie_encoder_decoder': False, 'max_length': 20, 'min_length': 0, 'do_sample': False, 'early_stopping': False, 'num_beams': 1, 'num_beam_groups': 1, 'diversity_penalty': 0.0, 'temperature': 1.0, 'top_k': 50, 'top_p': 1.0, 'typical_p': 1.0, 'repetition_penalty': 1.0, 'length_penalty': 1.0, 'no_repeat_ngram_size': 0, 'encoder_no_repeat_ngram_size': 0, 'bad_words_ids': None, 'num_return_sequences': 1, 'output_scores': False, 'return_dict_in_generate': False, 'forced_bos_token_id': None, 'forced_eos_token_id': None, 'remove_invalid_values': False, 'exponential_decay_length_penalty': None, 'suppress_tokens': None, 'begin_suppress_tokens': None, 'architectures': ['Phi3ForCausalLM'], 'finetuning_task': None, 'id2label': {0: 'LABEL_0', 1: 'LABEL_1'}, 'label2id': {'LABEL_0': 0, 'LABEL_1': 1}, 'tokenizer_class': None, 'prefix': None, 'bos_token_id': 1, 'pad_token_id': 32000, 'eos_token_id': 32000, 'sep_token_id': None, 'decoder_start_token_id': None, 'task_specific_params': None, 'problem_type': None, '_name_or_path': 'microsoft/Phi-3-mini-128k-instruct', 'transformers_version': '4.39.3', 'auto_map': {'AutoConfig': 'microsoft/Phi-3-mini-128k-instruct--configuration_phi3.Phi3Config', 'AutoModelForCausalLM': 
'microsoft/Phi-3-mini-128k-instruct--modeling_phi3.Phi3ForCausalLM'}, 'model_type': 'phi3', 'output_dir': '/kaggle/working/', 'overwrite_output_dir': False, 'do_train': False, 'do_eval': True, 'do_predict': False, 'evaluation_strategy': 'epoch', 'prediction_loss_only': False, 'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'per_gpu_train_batch_size': None, 'per_gpu_eval_batch_size': None, 'gradient_accumulation_steps': 4, 'eval_accumulation_steps': None, 'eval_delay': 0, 'learning_rate': 0.0002, 'weight_decay': 0.01, 'adam_beta1': 0.9, 'adam_beta2': 0.999, 'adam_epsilon': 1e-08, 'max_grad_norm': 1.0, 'num_train_epochs': 30, 'max_steps': -1, 'lr_scheduler_type': 'linear', 'lr_scheduler_kwargs': {}, 'warmup_ratio': 0.0, 'warmup_steps': 2, 'log_level': 'passive', 'log_level_replica': 'warning', 'log_on_each_node': True, 'logging_dir': '/kaggle/working/runs/Apr30_13-43-08_fde755c1ca53', 'logging_strategy': 'epoch', 'logging_first_step': False, 'logging_steps': 500, 'logging_nan_inf_filter': True, 'save_strategy': 'epoch', 'save_steps': 500, 'save_total_limit': None, 'save_safetensors': True, 'save_on_each_node': False, 'save_only_model': False, 'no_cuda': False, 'use_cpu': False, 'use_mps_device': False, 'seed': 42, 'data_seed': None, 'jit_mode_eval': False, 'use_ipex': False, 'bf16': False, 'fp16': True, 'fp16_opt_level': 'O1', 'half_precision_backend': 'auto', 'bf16_full_eval': False, 'fp16_full_eval': False, 'tf32': None, 'local_rank': 0, 'ddp_backend': None, 'tpu_num_cores': None, 'tpu_metrics_debug': False, 'debug': [], 'dataloader_drop_last': False, 'eval_steps': None, 'dataloader_num_workers': 0, 'dataloader_prefetch_factor': None, 'past_index': -1, 'run_name': '/kaggle/working/', 'disable_tqdm': False, 'remove_unused_columns': True, 'label_names': None, 'load_best_model_at_end': True, 'metric_for_best_model': 'loss', 'greater_is_better': False, 'ignore_data_skip': False, 'fsdp': [], 'fsdp_min_num_params': 0, 'fsdp_config': {'min_num_params': 
0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}, 'fsdp_transformer_layer_cls_to_wrap': None, 'accelerator_config': {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True}, 'deepspeed': None, 'label_smoothing_factor': 0.0, 'optim': 'paged_adamw_8bit', 'optim_args': None, 'adafactor': False, 'group_by_length': False, 'length_column_name': 'length', 'report_to': ['tensorboard', 'wandb'], 'ddp_find_unused_parameters': None, 'ddp_bucket_cap_mb': None, 'ddp_broadcast_buffers': None, 'dataloader_pin_memory': True, 'dataloader_persistent_workers': False, 'skip_memory_metrics': True, 'use_legacy_prediction_loop': False, 'push_to_hub': False, 'resume_from_checkpoint': None, 'hub_model_id': None, 'hub_strategy': 'every_save', 'hub_token': '<HUB_TOKEN>', 'hub_private_repo': False, 'hub_always_push': False, 'gradient_checkpointing': False, 'gradient_checkpointing_kwargs': None, 'include_inputs_for_metrics': False, 'fp16_backend': 'auto', 'push_to_hub_model_id': None, 'push_to_hub_organization': None, 'push_to_hub_token': '<PUSH_TO_HUB_TOKEN>', 'mp_parameters': '', 'auto_find_batch_size': False, 'full_determinism': False, 'torchdynamo': None, 'ray_scope': 'last', 'ddp_timeout': 1800, 'torch_compile': False, 'torch_compile_backend': None, 'torch_compile_mode': None, 'dispatch_batches': None, 'split_batches': None, 'include_tokens_per_second': False, 'include_num_input_tokens_seen': False, 'neftune_noise_alpha': None, 'optim_target_modules': None}
+ 2024-04-30 13:51:57,857 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-30 13:51:57,858 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
+ 2024-04-30 13:59:32,413 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
+ 2024-04-30 13:59:32,415 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-30 13:59:32,415 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
+ 2024-04-30 13:59:33,549 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
+ 2024-04-30 13:59:35,391 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-30 13:59:35,391 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
+ 2024-04-30 14:00:05,311 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
+ 2024-04-30 14:00:05,339 INFO MainThread:34 [jupyter.py:save_ipynb():373] not saving jupyter notebook
+ 2024-04-30 14:00:05,339 INFO MainThread:34 [wandb_init.py:_pause_backend():432] pausing backend
+ 2024-04-30 14:00:22,023 INFO MainThread:34 [wandb_init.py:_resume_backend():437] resuming backend
wandb/run-20240430_134330-p3yujlfg/run-p3yujlfg.wandb ADDED
Binary file (37.1 kB). View file