PMC_LLAMA_7B_trainer_Wiki_lora / trainer_peft.log
2024-05-29 18:45 - CUDA check
2024-05-29 18:45 - True
2024-05-29 18:45 - 1
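
The three entries above are the pattern a startup probe around torch.cuda would emit: availability, then device count. A minimal sketch of that check, assuming the file is written with Python's standard logging module (format string and logger name are guesses):

import logging

import torch

logging.basicConfig(
    filename="trainer_peft.log",
    format="%(asctime)s - %(message)s",
    datefmt="%Y-%m-%d %H:%M",
    level=logging.INFO,
)
logger = logging.getLogger(__name__)

logger.info("CUDA check")
logger.info(torch.cuda.is_available())  # "True" -> a GPU is visible
logger.info(torch.cuda.device_count())  # "1"    -> single-GPU run
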
2024-05-29 18:45 - Configure Model and tokenizer
2024-05-29 18:45 - Memory usage: 0.00 GB
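
Continuing the sketch, the configure step would load the base checkpoint and report device memory. The base model name below is an assumption (the log never states it), as is how the memory figure is measured:

from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "axiong/PMC_LLaMA_7B"  # assumed base checkpoint; not named in the log

logger.info("Configure Model and tokenizer")
tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(
    BASE, torch_dtype=torch.float16, device_map="auto"
)

# 0.00 GB on this first run vs 25.17 GB on every restart suggests the reading
# is taken while an earlier process may still hold the device; the exact
# measurement below (allocated memory) is a guess.
logger.info("Memory usage: %.2f GB", torch.cuda.memory_allocated() / 1024**3)
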
2024-05-29 18:45 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
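
Each run pairs one training corpus against the fixed Jingmei/Pandemic_WHO test set, which maps onto two Hub downloads; a plausible loading step, continuing the same sketch (the split names are assumptions):

from datasets import DatasetDict, load_dataset

data = DatasetDict(
    {
        "train": load_dataset("Jingmei/Pandemic_Wiki", split="train"),
        "test": load_dataset("Jingmei/Pandemic_WHO", split="train"),
    }
)
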
2024-05-29 18:46 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
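
Only input_ids and attention_mask survive tokenization, so the raw text column must be dropped inside the map call; a sketch, assuming the source column is named "text":

def tokenize(batch):
    # No truncation here: over-long documents are handled by the chunking
    # step that follows.
    return tokenizer(batch["text"])

tokenized = data.map(
    tokenize, batched=True, remove_columns=data["train"].column_names
)
logger.info("Tokenize data: %s", tokenized)
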
2024-05-29 18:49 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
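
Row counts jump at this step (train 2152 -> 24863, test 8264 -> 198964) because each document's token stream is concatenated and re-cut into fixed-length blocks. A sketch of that grouping recipe; the block size is an assumption, since the log does not state it:

BLOCK = 512  # assumed chunk length

def chunk(examples):
    # Concatenate every sequence in the batch, then slice the result into
    # BLOCK-token pieces, dropping the short remainder at the end.
    joined = {k: sum(examples[k], []) for k in examples}
    total = (len(joined["input_ids"]) // BLOCK) * BLOCK
    return {
        k: [v[i : i + BLOCK] for i in range(0, total, BLOCK)]
        for k, v in joined.items()
    }

chunked = tokenized.map(chunk, batched=True)
logger.info("Split data into chunks: %s", chunked)
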
2024-05-29 18:49 - Setup PEFT
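
Given the adapter repository's name (trainer_Wiki_lora), "Setup PEFT" is presumably a LoRA configuration; a minimal sketch in which rank, alpha, dropout, and target modules are all assumptions rather than values read from the log:

from peft import LoraConfig, TaskType, get_peft_model

logger.info("Setup PEFT")
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                 # assumed rank
    lora_alpha=32,                        # assumed scaling factor
    lora_dropout=0.05,                    # assumed
    target_modules=["q_proj", "v_proj"],  # assumed; a common LLaMA choice
)
model = get_peft_model(model, peft_config)
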
2024-05-29 18:49 - Setup optimizer
2024-05-29 18:49 - Start training
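
"Setup optimizer" and "Start training" would then reduce to a Trainer run writing checkpoints under ./trainer_Wiki_lora (Trainer builds its default AdamW optimizer unless one is passed in). Every hyperparameter below is a placeholder; only the output directory and checkpoint naming are visible from this repository:

from transformers import (
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

args = TrainingArguments(
    output_dir="./trainer_Wiki_lora",
    per_device_train_batch_size=4,  # placeholder
    num_train_epochs=1,             # placeholder
    logging_steps=10,               # placeholder
    save_steps=10,                  # placeholder; some cadence produced checkpoint-560
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=chunked["train"],
    eval_dataset=chunked["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
logger.info("Start training")
trainer.train()
logger.info("Training complete!!!")
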
2024-05-29 18:57 - CUDA check
2024-05-29 18:57 - True
2024-05-29 18:57 - 1
2024-05-29 18:57 - Configure Model and tokenizer
2024-05-29 18:57 - Memory usage: 25.17 GB
2024-05-29 18:57 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 18:57 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 18:57 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 18:57 - Setup PEFT
2024-05-29 18:57 - Setup optimizer
2024-05-29 18:57 - Start training
2024-05-29 19:04 - CUDA check
2024-05-29 19:04 - True
2024-05-29 19:04 - 1
2024-05-29 19:04 - Configure Model and tokenizer
2024-05-29 19:04 - Memory usage: 25.17 GB
2024-05-29 19:04 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 19:04 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 19:04 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 19:04 - Setup PEFT
2024-05-29 19:04 - Setup optimizer
2024-05-29 19:04 - Start training
2024-05-29 19:10 - CUDA check
2024-05-29 19:10 - True
2024-05-29 19:10 - 1
2024-05-29 19:10 - Configure Model and tokenizer
2024-05-29 19:10 - Memory usage: 25.17 GB
2024-05-29 19:10 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 19:10 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 19:10 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 19:10 - Setup PEFT
2024-05-29 19:10 - Setup optimizer
2024-05-29 19:10 - Start training
2024-05-29 19:16 - CUDA check
2024-05-29 19:16 - True
2024-05-29 19:16 - 1
2024-05-29 19:16 - Configure Model and tokenizer
2024-05-29 19:16 - Memory usage: 25.17 GB
2024-05-29 19:16 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 19:16 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 19:16 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 19:16 - Setup PEFT
2024-05-29 19:16 - Setup optimizer
2024-05-29 19:16 - Start training
2024-05-29 19:22 - CUDA check
2024-05-29 19:22 - True
2024-05-29 19:22 - 1
2024-05-29 19:22 - Configure Model and tokenizer
2024-05-29 19:22 - Memory usage: 25.17 GB
2024-05-29 19:22 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 19:22 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 19:22 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 19:22 - Setup PEFT
2024-05-29 19:22 - Setup optimizer
2024-05-29 19:22 - Start training
2024-05-29 19:29 - CUDA check
2024-05-29 19:29 - True
2024-05-29 19:29 - 1
2024-05-29 19:29 - Configure Model and tokenizer
2024-05-29 19:29 - Memory usage: 25.17 GB
2024-05-29 19:29 - Dataset loaded successfully:
train-Jingmei/Pandemic_Wiki
test -Jingmei/Pandemic_WHO
2024-05-29 19:29 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 19:29 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 19:29 - Setup PEFT
2024-05-29 19:29 - Setup optimizer
2024-05-29 19:29 - Start training
2024-05-29 22:44 - Training complete!!!
2024-05-29 23:26 - CUDA check
2024-05-29 23:26 - True
2024-05-29 23:26 - 1
2024-05-29 23:26 - Configure Model and tokenizer
2024-05-29 23:26 - Memory usage: 25.17 GB
2024-05-29 23:26 - Dataset loaded successfully:
train-Jingmei/Pandemic_CDC
test -Jingmei/Pandemic_WHO
2024-05-29 23:26 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 2152
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-29 23:26 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 24863
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-29 23:26 - Setup PEFT
2024-05-29 23:26 - Setup optimizer
2024-05-29 23:26 - Start training
2024-05-30 01:58 - Training complete!!!
2024-05-30 02:00 - CUDA check
2024-05-30 02:00 - True
2024-05-30 02:00 - 1
2024-05-30 02:00 - Configure Model and tokenizer
2024-05-30 02:00 - Memory usage: 25.17 GB
2024-05-30 02:00 - Dataset loaded successfully:
train-Jingmei/Pandemic_CDC
test -Jingmei/Pandemic_WHO
2024-05-30 02:03 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 15208
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-30 02:08 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 364678
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-30 02:08 - Setup PEFT
2024-05-30 02:08 - Setup optimizer
2024-05-30 02:08 - Start training
2024-05-30 17:35 - Training complete!!!
2024-05-30 19:45 - CUDA check
2024-05-30 19:45 - True
2024-05-30 19:45 - 1
2024-05-30 19:45 - Configure Model and tokenizer
2024-05-30 19:45 - Memory usage: 25.17 GB
2024-05-30 19:45 - Dataset loaded successfully:
train-Jingmei/Pandemic_ECDC
test -Jingmei/Pandemic_WHO
2024-05-30 19:46 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 7008
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-30 19:49 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 103936
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-30 19:49 - Setup PEFT
2024-05-30 19:49 - Setup optimizer
2024-05-30 19:49 - Start training
2024-05-30 19:49 - Training complete!!!
2024-05-30 20:16 - CUDA check
2024-05-30 20:16 - True
2024-05-30 20:16 - 1
2024-05-30 20:16 - Configure Model and tokenizer
2024-05-30 20:16 - Memory usage: 25.17 GB
2024-05-30 20:16 - Dataset loaded successfully:
train-Jingmei/Pandemic_ECDC
test -Jingmei/Pandemic_WHO
2024-05-30 20:16 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 7008
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-30 20:16 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 103936
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-30 20:16 - Setup PEFT
2024-05-30 20:16 - Setup optimizer
2024-05-30 20:16 - Resume from checkpoint - ./trainer_Wiki_lora/checkpoint-560
2024-05-30 20:16 - Training complete!!!
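
The 20:16 run resumes from an explicit checkpoint and reports completion within the same minute, consistent with the checkpoint already sitting at or past the final optimization step. In Trainer terms the resume is a one-argument change; the path is taken from the log line above:

logger.info("Resume from checkpoint - ./trainer_Wiki_lora/checkpoint-560")
trainer.train(resume_from_checkpoint="./trainer_Wiki_lora/checkpoint-560")
logger.info("Training complete!!!")
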
2024-05-30 20:30 - CUDA check
2024-05-30 20:30 - True
2024-05-30 20:30 - 1
2024-05-30 20:30 - Configure Model and tokenizer
2024-05-30 20:31 - Memory usage: 25.17 GB
2024-05-30 20:31 - Dataset loaded successfully:
train-Jingmei/Pandemic_Books
test -Jingmei/Pandemic_WHO
2024-05-30 20:33 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 5966
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-30 20:36 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 388208
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-30 20:36 - Setup PEFT
2024-05-30 20:36 - Setup optimizer
2024-05-30 20:36 - Resume from checkpoint
2024-05-31 00:03 - Training complete!!!
2024-05-31 00:20 - CUDA check
2024-05-31 00:20 - True
2024-05-31 00:20 - 1
2024-05-31 00:20 - Configure Model and tokenizer
2024-05-31 00:20 - Memory usage: 25.17 GB
2024-05-31 00:32 - Dataset loaded successfully:
train-Jingmei/Pandemic_PMC
test -Jingmei/Pandemic_WHO
2024-05-31 01:26 - Tokenize data: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 469701
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 8264
})
})
2024-05-31 03:25 - Split data into chunks: DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 13199845
})
test: Dataset({
features: ['input_ids', 'attention_mask'],
num_rows: 198964
})
})
2024-05-31 03:25 - Setup PEFT
2024-05-31 03:25 - Setup optimizer
2024-05-31 03:25 - Resume from checkpoint