gemma-7b-sql/nemo/nemo_error_log.txt
[NeMo W 2024-03-18 05:24:26 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/hydra/_internal/hydra.py:119: UserWarning: Future Hydra versions will no longer change working directory at job runtime by default.
See https://hydra.cc/docs/1.2/upgrades/1.1_to_1.2/changes_to_job_working_dir/ for more information.
ret = run_job(
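Note: this Hydra warning is informational. Per the linked 1.1-to-1.2 migration guide, Hydra jobs will no longer change the working directory at runtime by default, so relative paths in the training script should be resolved explicitly. A minimal sketch, assuming the script is a Hydra app (the relative path shown is illustrative):

    # Sketch: resolve paths against the launch directory instead of relying on
    # Hydra changing the working directory (behavior that changes in Hydra 1.2+).
    from hydra.utils import get_original_cwd, to_absolute_path

    launch_dir = get_original_cwd()                    # directory the job was launched from
    train_file = to_absolute_path("data/train.jsonl")  # hypothetical relative path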
[NeMo W 2024-03-18 05:24:26 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/lightning_fabric/connector.py:554: UserWarning: bf16 is supported for historical reasons but its usage is discouraged. Please set your precision to bf16-mixed instead!
rank_zero_warn(
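The Lightning connector discourages the legacy "bf16" precision string in favor of "bf16-mixed"; in a NeMo run this is normally the trainer.precision field of the training config. A minimal Lightning-level sketch (arguments other than precision are illustrative):

    # Sketch: prefer "bf16-mixed" over the legacy "bf16" precision string.
    import pytorch_lightning as pl

    trainer = pl.Trainer(
        precision="bf16-mixed",  # replaces precision="bf16"
        accelerator="gpu",
        devices=1,               # illustrative; match the actual launch configuration
    )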
[NeMo W 2024-03-18 05:24:26 exp_manager:708] Exp_manager is logging to models/gemma-7b-sql-nemo, but it already exists.
[NeMo W 2024-03-18 05:24:26 exp_manager:630] There were no checkpoints found in checkpoint_dir or no checkpoint folder at checkpoint_dir :models/gemma-7b-sql-nemo/checkpoints. Training from scratch.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_overlap in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_overlap in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: context_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: expert_model_parallel_size in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_overlap in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_ag in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_split_rs in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: tp_comm_bulk_dgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: finalize_model_grads_func in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:1078] The model: GPTSFTModel() does not have field.name: barrier_with_L1_time in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:492] The model: GPTSFTModel() does not have field.name: num_moe_experts in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:492] The model: GPTSFTModel() does not have field.name: bias_gelu_fusion in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:492] The model: GPTSFTModel() does not have field.name: fp8_wgrad in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
[NeMo W 2024-03-18 05:24:54 megatron_base_model:492] The model: GPTSFTModel() does not have field.name: clone_scatter_output_in_embedding in its cfg. Add this key to cfg or config_mapping to make to make it configurable.
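The warnings above mean the listed keys (context_parallel_size, expert_model_parallel_size, the tp_comm_* flags, etc.) are absent from the model cfg, so defaults are used; they can be silenced by declaring the keys explicitly in the SFT model config. A hedged sketch with OmegaConf (the config file name and the values shown are assumptions, not taken from this run):

    # Sketch: declare the missing keys so they become configurable instead of defaulted.
    from omegaconf import OmegaConf, open_dict

    cfg = OmegaConf.load("megatron_gpt_finetuning_config.yaml")  # hypothetical config path
    with open_dict(cfg.model):                                   # temporarily allow new keys
        cfg.model.context_parallel_size = 1
        cfg.model.expert_model_parallel_size = 1
        cfg.model.tp_comm_overlap = False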
[NeMo W 2024-03-18 05:24:55 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/transformer_engine/pytorch/module/base.py:611: UserWarning: To guarantee overlapping TP and SP collectives with the backwardGEMMs, set environment variable CUDA_DEVICE_MAX_CONNECTIONS = 1
warnings.warn(
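Transformer Engine asks for CUDA_DEVICE_MAX_CONNECTIONS=1 so tensor/sequence-parallel collectives can overlap with the backward GEMMs. The variable must be in the environment before CUDA is initialized, so it is usually exported by the launcher; a minimal sketch, assuming it runs at the very top of the training script:

    # Sketch: set the variable before any CUDA work (alternatively, export it in the
    # launch environment or job spec).
    import os
    os.environ.setdefault("CUDA_DEVICE_MAX_CONNECTIONS", "1")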
[NeMo W 2024-03-18 05:27:34 experimental:26] `<class 'nemo.collections.nlp.data.language_modeling.megatron.megatron_batch_samplers.MegatronPretrainingRandomBatchSampler'>` is experimental and not ready for production yet. Use at your own risk.
[NeMo W 2024-03-18 05:27:34 experimental:26] `<class 'nemo.collections.nlp.data.language_modeling.megatron.megatron_batch_samplers.MegatronPretrainingRandomBatchSampler'>` is experimental and not ready for production yet. Use at your own risk.
[NeMo W 2024-03-18 07:47:22 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/apex/transformer/pipeline_parallel/utils.py:81: UserWarning: This function is only for unittest
warnings.warn("This function is only for unittest")
[NeMo W 2024-03-18 07:53:39 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/pytorch_lightning/callbacks/model_checkpoint.py:359: UserWarning: `ModelCheckpoint(monitor='validation_loss')` could not find the monitored key in the returned metrics: ['step', 'consumed_samples', 'epoch', 'train_grad_norm', 'train_lr', 'train_loss', 'train_consumed_samples', 'train_step_time', 'train_epoch', 'val_loss', 'val_validation_step_time']. HINT: Did you call `log('validation_loss', value)` in the `LightningModule`?
warning_cache.warn(m)
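The checkpoint callback monitors 'validation_loss', but the run only logs the metrics listed in the warning (including 'val_loss'), so best-checkpoint selection is silently skipped. Either log 'validation_loss' from the LightningModule or point the callback at a metric that exists; in NeMo this is typically the exp_manager.checkpoint_callback_params.monitor field of the training YAML. A Lightning-level sketch of the latter fix:

    # Sketch: monitor a metric that is actually logged during validation ("val_loss" above).
    from pytorch_lightning.callbacks import ModelCheckpoint

    checkpoint_callback = ModelCheckpoint(
        monitor="val_loss",  # matches a key present in the returned metrics
        mode="min",
        save_top_k=1,
    )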