abarbosa committed
Commit 18f8c91
1 Parent(s): 251f554

first model aristo-roberta done
README.md ADDED
@@ -0,0 +1,131 @@
+ ---
+ language: "en"
+ tags:
+ license: "mit"
+ datasets:
+ - race
+ - arc
+ metrics:
+ - accuracy
+ ---
+
+ # Roberta Large Fine Tuned on RACE
+
+ ## Model description
+
+ This model follows the Allen AI team's implementation of the [Aristo Roberta V7 Model](https://leaderboard.allenai.org/arc/submission/blcotvl7rrltlue6bsv0) submitted to the [ARC Challenge](https://leaderboard.allenai.org/arc/submissions/public).
+
+ #### How to use
+
+ ```python
+ import logging
+
+ import datasets
+ import torch
+ from transformers import RobertaTokenizer
+ from transformers import RobertaForMultipleChoice
+
+ # Maximum sequence length used during fine-tuning (max_length in the table below)
+ MAX_SEQ_LENGTH = 256
+
+ tokenizer = RobertaTokenizer.from_pretrained(
+     "LIAMF-USP/aristo-roberta")
+ model = RobertaForMultipleChoice.from_pretrained(
+     "LIAMF-USP/aristo-roberta")
+ dataset = datasets.load_dataset(
+     "arc",
+     split=["train", "validation", "test"],
+ )
+ training_examples = dataset[0]
+ evaluation_examples = dataset[1]
+ test_examples = dataset[2]
+
+ example = training_examples[0]
+ example_id = example["example_id"]
+ question = example["question"]
+ label_example = example["answer"]
+ options = example["options"]
+
+ # Map the answer key (letter or digit) to a 0-based label index
+ if label_example in ["A", "B", "C", "D", "E"]:
+     label_map = {label: i for i, label in enumerate(
+         ["A", "B", "C", "D", "E"])}
+ elif label_example in ["1", "2", "3", "4", "5"]:
+     label_map = {label: i for i, label in enumerate(
+         ["1", "2", "3", "4", "5"])}
+ else:
+     print(f"{label_example} not found")
+
+ # Pad every question to exactly five options with empty choices
+ while len(options) < 5:
+     empty_option = {}
+     empty_option["option_context"] = ""
+     empty_option["option_text"] = ""
+     options.append(empty_option)
+
+ # Encode the question paired with each of its five options
+ choices_inputs = []
+ for ending_idx, option in enumerate(options):
+     ending = option["option_text"]
+     context = option["option_context"]
+     if question.find("_") != -1:
+         # fill-in-the-blank questions
+         question_option = question.replace("_", ending)
+     else:
+         question_option = question + " " + ending
+
+     inputs = tokenizer(
+         context,
+         question_option,
+         add_special_tokens=True,
+         max_length=MAX_SEQ_LENGTH,
+         padding="max_length",
+         truncation=True,
+         return_overflowing_tokens=False,
+     )
+
+     if "num_truncated_tokens" in inputs and inputs["num_truncated_tokens"] > 0:
+         logging.warning(f"Question: {example_id} with option {ending_idx} was truncated")
+     choices_inputs.append(inputs)
+
+ label = label_map[label_example]
+ input_ids = [x["input_ids"] for x in choices_inputs]
+ attention_mask = (
+     [x["attention_mask"] for x in choices_inputs]
+     # as the sentences follow the same structure, just one of them is
+     # necessary to check
+     if "attention_mask" in choices_inputs[0]
+     else None
+ )
+ token_type_ids = (
+     [x["token_type_ids"] for x in choices_inputs]
+     if "token_type_ids" in choices_inputs[0]
+     else None
+ )
+
+ # The model expects tensors of shape (batch_size, num_choices, seq_len)
+ example_encoded = {
+     "input_ids": torch.tensor([input_ids]),
+     "labels": torch.tensor([label]),
+ }
+ if attention_mask is not None:
+     example_encoded["attention_mask"] = torch.tensor([attention_mask])
+ if token_type_ids is not None:
+     example_encoded["token_type_ids"] = torch.tensor([token_type_ids])
+
+ output = model(**example_encoded)
+ ```
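+
+ Continuing from the snippet above, the highest-scoring option can be mapped back to the original answer key. This is a minimal sketch that assumes a `transformers` version returning `ModelOutput` objects (v4 or later); with older versions the logits are instead the second element of the returned tuple when labels are passed.
+
+ ```python
+ # Pick the option with the highest score and map the index
+ # back to the original answer key ("A"-"E" or "1"-"5").
+ predicted_index = output.logits.argmax(dim=-1).item()
+ reverse_label_map = {index: key for key, index in label_map.items()}
+ print(f"Predicted answer for question {example_id}: {reverse_label_map[predicted_index]}")
+ ```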
+
+ ## Training data
+
+ The training data was the same as proposed [here](https://leaderboard.allenai.org/arc/submission/blcotvl7rrltlue6bsv0).
+
+ The only difference was in the hyperparameters of the RACE fine-tuned model, which are reported [here](https://huggingface.co/LIAMF-USP/roberta-large-finetuned-race#eval-results).
+
+ ## Training procedure
+
+ The data had to be preprocessed with the method exemplified for a single instance in the _How to use_ section. The hyperparameters used were the following:
+
+ | Hyperparameter | Value |
+ |:----:|:----:|
+ | adam_beta1 | 0.9 |
+ | adam_beta2 | 0.98 |
+ | adam_epsilon | 1.000e-8 |
+ | eval_batch_size | 16 |
+ | train_batch_size | 4 |
+ | fp16 | True |
+ | gradient_accumulation_steps | 4 |
+ | learning_rate | 0.00001 |
+ | warmup_steps | 67 |
+ | max_length | 256 |
+ | epochs | 2 |
+
+ The other parameters were the default ones from [Trainer](https://huggingface.co/transformers/main_classes/trainer.html) and [Trainer Arguments](https://huggingface.co/transformers/main_classes/trainer.html#trainingarguments).
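+
+ For reference, this is a minimal sketch of how the table above maps onto `TrainingArguments` using the current argument names; the output directory is a placeholder, and any setting not listed in the table is left at its default:
+
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="./aristo-roberta-output",  # placeholder, not the path actually used
+     per_device_train_batch_size=4,
+     per_device_eval_batch_size=16,
+     gradient_accumulation_steps=4,
+     learning_rate=1e-5,
+     adam_beta1=0.9,
+     adam_beta2=0.98,
+     adam_epsilon=1e-8,
+     warmup_steps=67,
+     num_train_epochs=2,
+     fp16=True,
+ )
+ ```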
+
+ ## Eval results
+
+ | Dataset | Accuracy (%) |
+ |:----:|:----:|
+ | ARC Challenge Test | 64.249 |
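+
+ Accuracy here is standard multiple-choice accuracy: the fraction of questions whose highest-scoring option matches the gold answer. A minimal sketch of a `compute_metrics` function that could be passed to the Trainer, assuming the predictions are the per-choice logits:
+
+ ```python
+ import numpy as np
+
+ def compute_metrics(eval_prediction):
+     # eval_prediction.predictions: logits of shape (num_examples, num_choices)
+     # eval_prediction.label_ids: gold label indices of shape (num_examples,)
+     predictions = np.argmax(eval_prediction.predictions, axis=-1)
+     accuracy = float((predictions == eval_prediction.label_ids).mean())
+     return {"accuracy": accuracy}
+ ```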
+
+ **The model was trained with a TITAN RTX**
config.json ADDED
@@ -0,0 +1,25 @@
+ {
+   "_name_or_path": "/root/masters-project/aristo-roberta",
+   "architectures": [
+     "RobertaForMultipleChoice"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "gradient_checkpointing": false,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "total_flos": 1502266556732252160,
+   "type_vocab_size": 1,
+   "vocab_size": 50265
+ }
merges.txt ADDED
The diff for this file is too large to render. See raw diff
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:84b4be015199610ee05fe79c12509eeda5b999ea150a398ab3958967bdf73416
+ size 1421616585
special_tokens_map.json ADDED
@@ -0,0 +1 @@
+ {"bos_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "eos_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "unk_token": {"content": "<unk>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "sep_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "pad_token": {"content": "<pad>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "cls_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true}, "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true}}
tf_model.h5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8a6b5172ebeb1677bd8be2a747a81601f497856f21e670b10e45d12d2c1a1f74
+ size 1421961240
tokenizer_config.json ADDED
@@ -0,0 +1 @@
+ {"unk_token": {"content": "<unk>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "bos_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "eos_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "add_prefix_space": false, "errors": "replace", "sep_token": {"content": "</s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "cls_token": {"content": "<s>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "pad_token": {"content": "<pad>", "single_word": false, "lstrip": false, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "mask_token": {"content": "<mask>", "single_word": false, "lstrip": true, "rstrip": false, "normalized": true, "__type": "AddedToken"}, "model_max_length": 512, "special_tokens_map_file": "./results_model/bert-base-uncased/special_tokens_map.json", "name_or_path": "LIAMF-USP/roberta-large-finetuned-race"}
training_args.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f16888460a9cb0ba9ef68f5eade7a345ea94798b2d91aeba8c7ca872ba77c421
+ size 1839
vocab.json ADDED
The diff for this file is too large to render. See raw diff