## ALBERT xxlarge version 1 language model fine-tuned on SQuAD2.0
### (updated 30 Sept 2020) with the following results:
```
exact: 86.11134506864315
f1: 89.35371214945009
total: 11873
HasAns_exact: 83.56950067476383
HasAns_f1: 90.06353312254078
HasAns_total: 5928
NoAns_exact: 88.64592094196804
NoAns_f1: 88.64592094196804
NoAns_total: 5945
best_exact: 86.11134506864315
best_exact_thresh: 0.0
best_f1: 89.35371214944985
best_f1_thresh: 0.0
```
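As a sanity check on the figures above, the overall `exact` and `f1` scores should equal the average of the `HasAns` and `NoAns` scores, weighted by the number of questions in each subset. A minimal Python sketch using only the reported numbers (the variable and function names are illustrative and not part of the evaluation script):

```python
# Values copied from the evaluation results reported above.
has_ans = {"exact": 83.56950067476383, "f1": 90.06353312254078, "total": 5928}
no_ans = {"exact": 88.64592094196804, "f1": 88.64592094196804, "total": 5945}

# Total number of dev-set questions across both subsets.
total = has_ans["total"] + no_ans["total"]


def weighted(metric):
    """Average a metric over both subsets, weighted by question count."""
    return (
        has_ans[metric] * has_ans["total"] + no_ans[metric] * no_ans["total"]
    ) / total


overall_exact = weighted("exact")
overall_f1 = weighted("f1")
print(total, round(overall_exact, 4), round(overall_f1, 4))
```

The recomputed values should agree with the reported `total`, `exact`, and `f1` up to floating-point rounding. `NoAns_exact` and `NoAns_f1` are identical because an unanswerable question is scored as either fully right or fully wrong under both metrics.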
### from script:
```
python ${EXAMPLES}/run_squad.py \
--model_type albert \
--model_name_or_path albert-xxlarge-v1 \
--do_train \
--do_eval \
--train_file ${SQUAD}/train-v2.0.json \
--predict_file ${SQUAD}/dev-v2.0.json \
--version_2_with_negative \
--do_lower_case \
--num_train_epochs 3 \
--max_steps 8144 \
--warmup_steps 814 \
--learning_rate 3e-5 \
--max_seq_length 512 \
--doc_stride 128 \
--per_gpu_train_batch_size 6 \
--gradient_accumulation_steps 8 \
--per_gpu_eval_batch_size 48 \
--fp16 \
--fp16_opt_level O1 \
--threads 12 \
--logging_steps 50 \
--save_steps 3000 \
--overwrite_output_dir \
--output_dir ${MODEL_PATH}
```
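The batch-size and schedule flags in the command above interact, so it can help to work out the effective values. The sketch below is plain arithmetic on those flags; it assumes a single GPU (`n_gpus = 1` is an assumption based on the one GPU listed for this system, not something the command states) and that `--max_steps` counts optimizer updates rather than forward passes:

```python
# Flag values copied from the run_squad.py command above.
per_gpu_train_batch_size = 6
gradient_accumulation_steps = 8
max_steps = 8144
warmup_steps = 814

# Assumption: training ran on a single GPU.
n_gpus = 1

# Sequences contributing to each optimizer update.
effective_batch = per_gpu_train_batch_size * gradient_accumulation_steps * n_gpus

# Share of the schedule spent on learning-rate warmup.
warmup_fraction = warmup_steps / max_steps

# Total training sequences processed, if max_steps counts optimizer updates.
sequences_seen = max_steps * effective_batch

print(effective_batch, round(warmup_fraction, 4), sequences_seen)
```

Under these assumptions each update sees 48 sequences and warmup covers roughly the first 10% of training. Since both `--num_train_epochs` and a positive `--max_steps` are passed, it is worth confirming in the script version used which of the two determines the length of training.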
### using the following software & system:
```
Transformers: 3.1.0
PyTorch: 1.6.0
TensorFlow: 2.3.1
Python: 3.8.1
OS: Linux-5.4.0-48-generic-x86_64-with-glibc2.10
CPU/GPU: Intel i9-9900K / NVIDIA Titan RTX 24GB
```