Rodrigo1771 committed on
Commit
ecba9ea
•
1 Parent(s): 83f2810

End of training

README.md ADDED
@@ -0,0 +1,63 @@
+ ---
+ license: apache-2.0
+ base_model: PlanTL-GOB-ES/bsc-bio-ehr-es
+ tags:
+ - token-classification
+ - generated_from_trainer
+ datasets:
+ - Rodrigo1771/multi-train-distemist-dev-ner
+ model-index:
+ - name: output
+ results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # output
+
+ This model is a fine-tuned version of [PlanTL-GOB-ES/bsc-bio-ehr-es](https://huggingface.co/PlanTL-GOB-ES/bsc-bio-ehr-es) on the Rodrigo1771/multi-train-distemist-dev-ner dataset.
+ It achieves the following results on the evaluation set:
+ - eval_loss: 2.3735
+ - eval_precision: 0.0045
+ - eval_recall: 0.1273
+ - eval_f1: 0.0088
+ - eval_accuracy: 0.0187
+ - eval_runtime: 16.9354
+ - eval_samples_per_second: 401.939
+ - eval_steps_per_second: 50.25
+ - step: 0
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-05
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 10.0
+
+ ### Framework versions
+
+ - Transformers 4.40.2
+ - Pytorch 2.2.1+cu121
+ - Datasets 2.19.1
+ - Tokenizers 0.19.1
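The hyperparameters above combine `train_batch_size: 4` with `gradient_accumulation_steps: 4` to give `total_train_batch_size: 16`. A toy sketch of why accumulating over micro-batches is equivalent to one larger batch (hypothetical per-sample "gradients", not the Trainer's actual API):

```python
# Four micro-batches of 4 samples each, i.e. an effective batch of 16.
micro_batches = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0],
                 [9.0, 10.0, 11.0, 12.0], [13.0, 14.0, 15.0, 16.0]]

# Accumulate the per-micro-batch mean gradient, scaled by 1/accum_steps
# (the same scaling the loss receives under gradient accumulation).
accum = 0.0
for mb in micro_batches:
    accum += sum(mb) / len(mb) / len(micro_batches)

# Gradient of a single batch of all 16 samples.
full = sum(sum(mb) for mb in micro_batches) / 16

assert abs(accum - full) < 1e-9
print(accum)  # 8.5
```

With equal-sized micro-batches the mean of means equals the grand mean, so the accumulated update matches the full-batch update exactly.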
all_results.json ADDED
@@ -0,0 +1,19 @@
+ {
+ "eval_accuracy": 0.018743434648577764,
+ "eval_f1": 0.008761828065230522,
+ "eval_loss": 2.3735408782958984,
+ "eval_precision": 0.004537076421380973,
+ "eval_recall": 0.1272812353766963,
+ "eval_runtime": 16.9354,
+ "eval_samples": 6807,
+ "eval_samples_per_second": 401.939,
+ "eval_steps_per_second": 50.25,
+ "predict_accuracy": 0.018743434648577764,
+ "predict_f1": 0.008761828065230522,
+ "predict_loss": 2.3735408782958984,
+ "predict_precision": 0.004537076421380973,
+ "predict_recall": 0.1272812353766963,
+ "predict_runtime": 16.1794,
+ "predict_samples_per_second": 420.719,
+ "predict_steps_per_second": 52.598
+ }
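The reported metrics are internally consistent: `eval_f1` is the harmonic mean of `eval_precision` and `eval_recall`. A quick check in plain Python, with the values copied from the JSON above:

```python
# Verify f1 = 2PR / (P + R) against the values reported in all_results.json.
p = 0.004537076421380973   # eval_precision
r = 0.1272812353766963     # eval_recall
f1 = 2 * p * r / (p + r)
assert abs(f1 - 0.008761828065230522) < 1e-9  # eval_f1 as reported
print(f1)
```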
config.json ADDED
@@ -0,0 +1,51 @@
+ {
+ "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es",
+ "architectures": [
+ "RobertaForTokenClassification"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "finetuning_task": "ner",
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "id2label": {
+ "0": "O",
+ "1": "B-ENFERMEDAD",
+ "2": "I-ENFERMEDAD",
+ "3": "B-PROCEDIMIENTO",
+ "4": "I-PROCEDIMIENTO",
+ "5": "B-SINTOMA",
+ "6": "I-SINTOMA",
+ "7": "B-FARMACO",
+ "8": "I-FARMACO"
+ },
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "label2id": {
+ "B-ENFERMEDAD": 1,
+ "B-FARMACO": 7,
+ "B-PROCEDIMIENTO": 3,
+ "B-SINTOMA": 5,
+ "I-ENFERMEDAD": 2,
+ "I-FARMACO": 8,
+ "I-PROCEDIMIENTO": 4,
+ "I-SINTOMA": 6,
+ "O": 0
+ },
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.40.2",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50262
+ }
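The `id2label` map in this config defines a standard BIO tagging scheme over four Spanish clinical entity types. A minimal sketch of how per-token class ids decode into entity spans (the tokens and id sequence below are made-up examples, and this is a simplified decoder, not the `transformers` pipeline's aggregation logic):

```python
# id2label copied from the config above.
id2label = {0: "O", 1: "B-ENFERMEDAD", 2: "I-ENFERMEDAD",
            3: "B-PROCEDIMIENTO", 4: "I-PROCEDIMIENTO",
            5: "B-SINTOMA", 6: "I-SINTOMA",
            7: "B-FARMACO", 8: "I-FARMACO"}

def decode_bio(tokens, ids):
    """Group B-/I- tagged tokens into (entity_type, text) spans."""
    spans, current = [], None
    for tok, i in zip(tokens, ids):
        tag = id2label[i]
        if tag.startswith("B-"):            # a new entity starts
            if current:
                spans.append(current)
            current = (tag[2:], [tok])
        elif tag.startswith("I-") and current and current[0] == tag[2:]:
            current[1].append(tok)          # continue the open entity
        else:                               # "O" or an inconsistent I- tag
            if current:
                spans.append(current)
            current = None
    if current:
        spans.append(current)
    return [(etype, " ".join(words)) for etype, words in spans]

print(decode_bio(["dolor", "de", "cabeza", "y", "paracetamol"],
                 [5, 6, 6, 0, 7]))
# [('SINTOMA', 'dolor de cabeza'), ('FARMACO', 'paracetamol')]
```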
eval_results.json ADDED
@@ -0,0 +1,11 @@
+ {
+ "eval_accuracy": 0.018743434648577764,
+ "eval_f1": 0.008761828065230522,
+ "eval_loss": 2.3735408782958984,
+ "eval_precision": 0.004537076421380973,
+ "eval_recall": 0.1272812353766963,
+ "eval_runtime": 16.9354,
+ "eval_samples": 6807,
+ "eval_samples_per_second": 401.939,
+ "eval_steps_per_second": 50.25
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6cb8f27d656928c021399304ae37e82d147c80df5386c804b66e664424a60fee
+ size 496262556
predict_results.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "predict_accuracy": 0.018743434648577764,
+ "predict_f1": 0.008761828065230522,
+ "predict_loss": 2.3735408782958984,
+ "predict_precision": 0.004537076421380973,
+ "predict_recall": 0.1272812353766963,
+ "predict_runtime": 16.1794,
+ "predict_samples_per_second": 420.719,
+ "predict_steps_per_second": 52.598
+ }
predictions.txt ADDED
The diff for this file is too large to render. See raw diff
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+ "bos_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "cls_token": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "eos_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tb/events.out.tfevents.1715599919.dff07dfba241.4551.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:53f2d751e4f7bee8be430707a913556ed6397cb45dac39b8842fa2e13944d1ba
+ size 486
tb/events.out.tfevents.1715600029.dff07dfba241.5135.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fdbb0b41e4cdcc8831058dade865be672deae192e10b50a2f2aebc3a27db9443
+ size 486
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+ "add_prefix_space": true,
+ "added_tokens_decoder": {
+ "0": {
+ "content": "<s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "1": {
+ "content": "<pad>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "2": {
+ "content": "</s>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "3": {
+ "content": "<unk>",
+ "lstrip": false,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "50261": {
+ "content": "<mask>",
+ "lstrip": true,
+ "normalized": true,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "bos_token": "<s>",
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "<s>",
+ "eos_token": "</s>",
+ "errors": "replace",
+ "mask_token": "<mask>",
+ "max_len": 512,
+ "model_max_length": 512,
+ "pad_token": "<pad>",
+ "sep_token": "</s>",
+ "tokenizer_class": "RobertaTokenizer",
+ "trim_offsets": true,
+ "unk_token": "<unk>"
+ }
train.log ADDED
@@ -0,0 +1,357 @@
 [tqdm progress bar, truncated: 0/851 → 846/851 batches, ~65-77 it/s]
 /usr/local/lib/python3.10/dist-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
 [tqdm progress bar, truncated: 0/851 → 839/851 batches, ~64-93 it/s]
+ 2024-05-13 11:33:22.284311: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
+ 2024-05-13 11:33:22.284357: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
+ 2024-05-13 11:33:22.286270: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
+ 2024-05-13 11:33:23.400293: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
+ 05/13/2024 11:33:25 - WARNING - __main__ - Process rank: 0, device: cuda:0, n_gpu: 1distributed training: True, 16-bits training: False
+ 05/13/2024 11:33:25 - INFO - __main__ - Training/evaluation parameters TrainingArguments(
+ _n_gpu=1,
+ accelerator_config={'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'gradient_accumulation_kwargs': None},
+ adafactor=False,
+ adam_beta1=0.9,
+ adam_beta2=0.999,
+ adam_epsilon=1e-08,
+ auto_find_batch_size=False,
+ bf16=False,
+ bf16_full_eval=False,
+ data_seed=None,
+ dataloader_drop_last=False,
+ dataloader_num_workers=0,
+ dataloader_persistent_workers=False,
+ dataloader_pin_memory=True,
+ dataloader_prefetch_factor=None,
+ ddp_backend=None,
+ ddp_broadcast_buffers=None,
+ ddp_bucket_cap_mb=None,
+ ddp_find_unused_parameters=None,
+ ddp_timeout=1800,
+ debug=[],
+ deepspeed=None,
+ disable_tqdm=False,
+ dispatch_batches=None,
+ do_eval=True,
+ do_predict=True,
+ do_train=False,
+ eval_accumulation_steps=None,
+ eval_delay=0,
+ eval_do_concat_batches=True,
+ eval_steps=None,
+ evaluation_strategy=epoch,
+ fp16=False,
+ fp16_backend=auto,
+ fp16_full_eval=False,
+ fp16_opt_level=O1,
+ fsdp=[],
+ fsdp_config={'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False},
+ fsdp_min_num_params=0,
+ fsdp_transformer_layer_cls_to_wrap=None,
+ full_determinism=False,
+ gradient_accumulation_steps=4,
+ gradient_checkpointing=False,
+ gradient_checkpointing_kwargs=None,
+ greater_is_better=True,
+ group_by_length=False,
+ half_precision_backend=auto,
+ hub_always_push=False,
+ hub_model_id=None,
+ hub_private_repo=False,
+ hub_strategy=every_save,
+ hub_token=<HUB_TOKEN>,
+ ignore_data_skip=False,
+ include_inputs_for_metrics=False,
+ include_num_input_tokens_seen=False,
+ include_tokens_per_second=False,
+ jit_mode_eval=False,
+ label_names=None,
+ label_smoothing_factor=0.0,
+ learning_rate=5e-05,
+ length_column_name=length,
+ load_best_model_at_end=True,
+ local_rank=0,
+ log_level=passive,
+ log_level_replica=warning,
+ log_on_each_node=True,
+ logging_dir=/content/dissertation/scripts/ner/output/tb,
+ logging_first_step=False,
+ logging_nan_inf_filter=True,
+ logging_steps=500,
+ logging_strategy=steps,
+ lr_scheduler_kwargs={},
+ lr_scheduler_type=linear,
+ max_grad_norm=1.0,
+ max_steps=-1,
+ metric_for_best_model=f1,
+ mp_parameters=,
+ neftune_noise_alpha=None,
+ no_cuda=False,
+ num_train_epochs=10.0,
+ optim=adamw_torch,
+ optim_args=None,
+ optim_target_modules=None,
+ output_dir=/content/dissertation/scripts/ner/output,
+ overwrite_output_dir=True,
+ past_index=-1,
+ per_device_eval_batch_size=8,
+ per_device_train_batch_size=4,
+ prediction_loss_only=False,
+ push_to_hub=True,
+ push_to_hub_model_id=None,
+ push_to_hub_organization=None,
+ push_to_hub_token=<PUSH_TO_HUB_TOKEN>,
+ ray_scope=last,
+ remove_unused_columns=True,
+ report_to=['tensorboard'],
+ resume_from_checkpoint=None,
+ run_name=/content/dissertation/scripts/ner/output,
+ save_on_each_node=False,
+ save_only_model=False,
+ save_safetensors=True,
+ save_steps=500,
+ save_strategy=epoch,
+ save_total_limit=None,
+ seed=42,
+ skip_memory_metrics=True,
+ split_batches=None,
+ tf32=None,
+ torch_compile=False,
+ torch_compile_backend=None,
+ torch_compile_mode=None,
+ torchdynamo=None,
+ tpu_metrics_debug=False,
+ tpu_num_cores=None,
+ use_cpu=False,
+ use_ipex=False,
+ use_legacy_prediction_loop=False,
+ use_mps_device=False,
+ warmup_ratio=0.0,
+ warmup_steps=0,
+ weight_decay=0.0,
+ )
129
+ /usr/local/lib/python3.10/dist-packages/datasets/load.py:1486: FutureWarning: The repository for Rodrigo1771/multi-train-distemist-dev-ner contains custom code which must be executed to correctly load the dataset. You can inspect the repository content at https://hf.co/datasets/Rodrigo1771/multi-train-distemist-dev-ner
130
+ You can avoid this message in future by passing the argument `trust_remote_code=True`.
131
+ Passing `trust_remote_code=True` will be mandatory to load this dataset from the next major release of `datasets`.
132
+ warnings.warn(
133
+ /usr/local/lib/python3.10/dist-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
134
+ warnings.warn(
135
+ [INFO|configuration_utils.py:726] 2024-05-13 11:33:29,256 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json
136
+ [INFO|configuration_utils.py:789] 2024-05-13 11:33:29,260 >> Model config RobertaConfig {
137
+ "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es",
138
+ "architectures": [
139
+ "RobertaForMaskedLM"
140
+ ],
141
+ "attention_probs_dropout_prob": 0.1,
142
+ "bos_token_id": 0,
143
+ "classifier_dropout": null,
144
+ "eos_token_id": 2,
145
+ "finetuning_task": "ner",
146
+ "gradient_checkpointing": false,
147
+ "hidden_act": "gelu",
148
+ "hidden_dropout_prob": 0.1,
149
+ "hidden_size": 768,
150
+ "id2label": {
151
+ "0": "O",
152
+ "1": "B-ENFERMEDAD",
153
+ "2": "I-ENFERMEDAD",
154
+ "3": "B-PROCEDIMIENTO",
155
+ "4": "I-PROCEDIMIENTO",
156
+ "5": "B-SINTOMA",
157
+ "6": "I-SINTOMA",
158
+ "7": "B-FARMACO",
159
+ "8": "I-FARMACO"
160
+ },
161
+ "initializer_range": 0.02,
162
+ "intermediate_size": 3072,
163
+ "label2id": {
164
+ "B-ENFERMEDAD": 1,
165
+ "B-FARMACO": 7,
166
+ "B-PROCEDIMIENTO": 3,
167
+ "B-SINTOMA": 5,
168
+ "I-ENFERMEDAD": 2,
169
+ "I-FARMACO": 8,
170
+ "I-PROCEDIMIENTO": 4,
171
+ "I-SINTOMA": 6,
172
+ "O": 0
173
+ },
174
+ "layer_norm_eps": 1e-05,
175
+ "max_position_embeddings": 514,
176
+ "model_type": "roberta",
177
+ "num_attention_heads": 12,
178
+ "num_hidden_layers": 12,
179
+ "pad_token_id": 1,
180
+ "position_embedding_type": "absolute",
181
+ "transformers_version": "4.40.2",
182
+ "type_vocab_size": 1,
183
+ "use_cache": true,
184
+ "vocab_size": 50262
185
+ }
186
+
187
+ [INFO|configuration_utils.py:726] 2024-05-13 11:33:29,519 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json
188
+ [INFO|configuration_utils.py:789] 2024-05-13 11:33:29,520 >> Model config RobertaConfig {
189
+ "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es",
+ "architectures": [
+ "RobertaForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.40.2",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50262
+ }
+
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,529 >> loading file vocab.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/vocab.json
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,529 >> loading file merges.txt from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/merges.txt
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,530 >> loading file tokenizer.json from cache at None
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,530 >> loading file added_tokens.json from cache at None
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,530 >> loading file special_tokens_map.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/special_tokens_map.json
+ [INFO|tokenization_utils_base.py:2087] 2024-05-13 11:33:29,530 >> loading file tokenizer_config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/tokenizer_config.json
+ [INFO|configuration_utils.py:726] 2024-05-13 11:33:29,530 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json
+ [INFO|configuration_utils.py:789] 2024-05-13 11:33:29,531 >> Model config RobertaConfig {
+ "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es",
+ "architectures": [
+ "RobertaForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.40.2",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50262
+ }
+
+ [INFO|configuration_utils.py:726] 2024-05-13 11:33:29,608 >> loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/config.json
+ [INFO|configuration_utils.py:789] 2024-05-13 11:33:29,609 >> Model config RobertaConfig {
+ "_name_or_path": "PlanTL-GOB-ES/bsc-bio-ehr-es",
+ "architectures": [
+ "RobertaForMaskedLM"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "bos_token_id": 0,
+ "classifier_dropout": null,
+ "eos_token_id": 2,
+ "gradient_checkpointing": false,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 768,
+ "initializer_range": 0.02,
+ "intermediate_size": 3072,
+ "layer_norm_eps": 1e-05,
+ "max_position_embeddings": 514,
+ "model_type": "roberta",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 1,
+ "position_embedding_type": "absolute",
+ "transformers_version": "4.40.2",
+ "type_vocab_size": 1,
+ "use_cache": true,
+ "vocab_size": 50262
+ }
+
+ [INFO|modeling_utils.py:3429] 2024-05-13 11:33:30,009 >> loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--PlanTL-GOB-ES--bsc-bio-ehr-es/snapshots/1e543adb2d21f19d85a89305eebdbd64ab656b99/pytorch_model.bin
+ [INFO|modeling_utils.py:4160] 2024-05-13 11:33:30,135 >> Some weights of the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es were not used when initializing RobertaForTokenClassification: ['lm_head.bias', 'lm_head.decoder.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight']
+ - This IS expected if you are initializing RobertaForTokenClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
+ - This IS NOT expected if you are initializing RobertaForTokenClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
+ [WARNING|modeling_utils.py:4172] 2024-05-13 11:33:30,135 >> Some weights of RobertaForTokenClassification were not initialized from the model checkpoint at PlanTL-GOB-ES/bsc-bio-ehr-es and are newly initialized: ['classifier.bias', 'classifier.weight']
+ You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
+
+ /content/dissertation/scripts/ner/run_ner.py:397: FutureWarning: load_metric is deprecated and will be removed in the next major version of datasets. Use 'evaluate.load' instead, from the new library 🤗 Evaluate: https://huggingface.co/docs/evaluate
+ metric = load_metric("seqeval")
+ /usr/local/lib/python3.10/dist-packages/datasets/load.py:759: FutureWarning: The repository for seqeval contains custom code which must be executed to correctly load the metric. You can inspect the repository content at https://raw.githubusercontent.com/huggingface/datasets/2.19.1/metrics/seqeval/seqeval.py
+ You can avoid this message in future by passing the argument `trust_remote_code=True`.
+ Passing `trust_remote_code=True` will be mandatory to load this metric from the next major release of `datasets`.
+ warnings.warn(
+ 05/13/2024 11:33:32 - INFO - __main__ - *** Evaluate ***
+ [INFO|trainer.py:786] 2024-05-13 11:33:32,238 >> The following columns in the evaluation set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, tokens, id. If ner_tags, tokens, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
+ [INFO|trainer.py:3614] 2024-05-13 11:33:32,243 >> ***** Running Evaluation *****
+ [INFO|trainer.py:3616] 2024-05-13 11:33:32,243 >> Num examples = 6807
+ [INFO|trainer.py:3619] 2024-05-13 11:33:32,243 >> Batch size = 8
+
  0%| | 0/851 [00:00<?, ?it/s]
  99%|█████████▉| 846/851 [00:11<00:00, 67.56it/s]/usr/local/lib/python3.10/dist-packages/seqeval/metrics/v1.py:57: UndefinedMetricWarning: Recall and F-score are ill-defined and being set to 0.0 in labels with no true samples. Use `zero_division` parameter to control this behavior.
+ _warn_prf(average, modifier, msg_start, len(result))
+ _warn_prf(average, modifier, msg_start, len(result))
+
+ ***** eval metrics *****
+ eval_accuracy = 0.0187
+ eval_f1 = 0.0088
+ eval_loss = 2.3735
+ eval_precision = 0.0045
+ eval_recall = 0.1273
+ eval_runtime = 0:00:16.93
+ eval_samples = 6807
+ eval_samples_per_second = 401.939
+ eval_steps_per_second = 50.25
+ 05/13/2024 11:33:49 - INFO - __main__ - *** Predict ***
+ [INFO|trainer.py:786] 2024-05-13 11:33:49,182 >> The following columns in the test set don't have a corresponding argument in `RobertaForTokenClassification.forward` and have been ignored: ner_tags, tokens, id. If ner_tags, tokens, id are not expected by `RobertaForTokenClassification.forward`, you can safely ignore this message.
+ [INFO|trainer.py:3614] 2024-05-13 11:33:49,184 >> ***** Running Prediction *****
+ [INFO|trainer.py:3616] 2024-05-13 11:33:49,185 >> Num examples = 6807
+ [INFO|trainer.py:3619] 2024-05-13 11:33:49,185 >> Batch size = 8
+
  0%| | 0/851 [00:00<?, ?it/s]
  99%|█████████▊| 839/851 [00:11<00:00, 70.65it/s]
+ [INFO|trainer.py:3305] 2024-05-13 11:34:05,686 >> Saving model checkpoint to /content/dissertation/scripts/ner/output
+ [INFO|configuration_utils.py:471] 2024-05-13 11:34:05,688 >> Configuration saved in /content/dissertation/scripts/ner/output/config.json
+ [INFO|modeling_utils.py:2590] 2024-05-13 11:34:06,653 >> Model weights saved in /content/dissertation/scripts/ner/output/model.safetensors
+ [INFO|tokenization_utils_base.py:2488] 2024-05-13 11:34:06,654 >> tokenizer config file saved in /content/dissertation/scripts/ner/output/tokenizer_config.json
+ [INFO|tokenization_utils_base.py:2497] 2024-05-13 11:34:06,655 >> Special tokens file saved in /content/dissertation/scripts/ner/output/special_tokens_map.json
+ [INFO|modelcard.py:450] 2024-05-13 11:34:06,959 >> Dropping the following result as it does not have all the necessary fields:
+ {'task': {'name': 'Token Classification', 'type': 'token-classification'}, 'dataset': {'name': 'Rodrigo1771/multi-train-distemist-dev-ner', 'type': 'Rodrigo1771/multi-train-distemist-dev-ner', 'config': 'MultiTrainDisTEMISTDevNER', 'split': 'validation', 'args': 'MultiTrainDisTEMISTDevNER'}}
+ ***** predict metrics *****
+ predict_accuracy = 0.0187
+ predict_f1 = 0.0088
+ predict_loss = 2.3735
+ predict_precision = 0.0045
+ predict_recall = 0.1273
+ predict_runtime = 0:00:16.17
+ predict_samples_per_second = 420.719
+ predict_steps_per_second = 52.598
+
training_args.bin ADDED
@@ -0,0 +1,3 @@
 
 
 
 
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:18cce49dc172023921a8c234d2d643afeaa0b4f8f209fd2e1d76d267cbbe3c95
+ size 5048
vocab.json ADDED
The diff for this file is too large to render. See raw diff