vuiseng9/roberta-l-squadv1.1


Vui Seng Chua

add content

270b51c about 2 years ago


	---
	license: mit
	tags:
	- generated_from_trainer
	datasets:
	- squad
	model-index:
	- name: run05-roberta-large-squadv1.1-sl384-ds128-e2-tbs16
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# run05-roberta-large-squadv1.1-sl384-ds128-e2-tbs16

	This model is a fine-tuned version of [roberta-large](https://huggingface.co/roberta-large) on the SQuAD v1.1 dataset (`squad`) for extractive question answering.
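As a minimal sketch of how the checkpoint might be queried, the helper below wraps the Transformers `question-answering` pipeline. The function name `answer_question` is hypothetical and not part of the original card; the model id is the repository name shown above.

```python
def answer_question(question: str, context: str,
                    model_id: str = "vuiseng9/roberta-l-squadv1.1") -> dict:
    """Run extractive QA and return the pipeline's result dict."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed or the checkpoint to be downloaded.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id)
    # The result is a dict with "answer", "score", "start", and "end" keys.
    return qa(question=question, context=context)
```

Calling `answer_question("Who wrote the report?", "The report was written by Ada.")` downloads the checkpoint on first use, so it needs network access and enough memory for a RoBERTa-large model.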

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

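	The model was trained and evaluated on the `squad` dataset (SQuAD v1.1), with inputs limited to 384 tokens and a document stride of 128.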

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 16
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 2.0
	- mixed_precision_training: Native AMP
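The `linear` scheduler listed above decays the learning rate from its peak toward zero over training. The sketch below illustrates that shape, assuming no warmup (the card lists none) and a hypothetical `total_steps` value, since the actual step count is not recorded here.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 3e-5,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps=1000 is a made-up value used only for illustration.
for s in (0, 250, 500, 750, 1000):
    print(s, linear_lr(s, total_steps=1000))
```

With zero warmup, the rate at the halfway point is half of the peak `learning_rate`.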

	### Training results



	### Framework versions

	- Transformers 4.18.0
	- Pytorch 1.11.0+cu113
	- Datasets 2.1.0
	- Tokenizers 0.12.1

	# Train
	```bash
	# Assumes RUNID and OUTDIR are already set in the shell
	python run_qa.py \
	--model_name_or_path roberta-large \
	--dataset_name squad \
	--do_eval \
	--do_train \
	--evaluation_strategy steps \
	--eval_steps 500 \
	--learning_rate 3e-5 \
	--fp16 \
	--num_train_epochs 2 \
	--per_device_eval_batch_size 64 \
	--per_device_train_batch_size 16 \
	--max_seq_length 384 \
	--doc_stride 128 \
	--save_steps 1000 \
	--logging_steps 1 \
	--overwrite_output_dir \
	--run_name $RUNID \
	--output_dir $OUTDIR
	```
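The `--max_seq_length 384` and `--doc_stride 128` flags control how contexts longer than one input are split into overlapping windows. The sketch below approximates that splitting in plain Python; it is not the tokenizer's own code, and it assumes 4 special tokens per question/context pair (the RoBERTa pair format) and a hypothetical helper name `context_windows`.

```python
from typing import List, Tuple


def context_windows(context_len: int, question_len: int,
                    max_seq_length: int = 384, doc_stride: int = 128,
                    num_special_tokens: int = 4) -> List[Tuple[int, int]]:
    """Return (start, end) context-token spans, with end exclusive."""
    # Tokens left for the context once the question and special tokens fit.
    capacity = max_seq_length - question_len - num_special_tokens
    if capacity <= doc_stride:
        raise ValueError("question too long for this max_seq_length/doc_stride")
    windows = []
    start = 0
    while True:
        end = min(start + capacity, context_len)
        windows.append((start, end))
        if end == context_len:
            break
        # Consecutive windows share doc_stride tokens of overlap.
        start += capacity - doc_stride
    return windows


# A 500-token context with a 20-token question needs two windows.
print(context_windows(500, 20))  # [(0, 360), (232, 500)]
```

The overlap means an answer that straddles a window boundary is still fully contained in at least one window, provided it is shorter than the stride.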

	# Eval
	```bash
	export CUDA_VISIBLE_DEVICES=0

	MODEL=vuiseng9/roberta-l-squadv1.1
	OUTDIR=eval-$(basename $MODEL)
	WORKDIR=transformers/examples/pytorch/question-answering
	cd $WORKDIR
	mkdir -p $OUTDIR  # tee needs the log directory to exist before the run starts

	nohup python run_qa.py \
	--model_name_or_path $MODEL \
	--dataset_name squad \
	--do_eval \
	--per_device_eval_batch_size 16 \
	--max_seq_length 384 \
	--doc_stride 128 \
	--overwrite_output_dir \
	--output_dir $OUTDIR 2>&1 | tee $OUTDIR/run.log &
	```
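SQuAD v1.1 predictions are conventionally scored with exact match and token-level F1. The sketch below follows the standard SQuAD normalization (lowercasing, stripping punctuation and articles) to show how those two numbers are derived for a single prediction; it is an illustration, not the code that `run_qa.py` calls, and the card itself reports no scores.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1(prediction: str, truth: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    # Multiset intersection counts tokens shared by both answers.
    num_same = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("The Eiffel Tower.", "eiffel tower"))  # 1.0
print(f1("in Paris France", "Paris"))
```

When a question has several reference answers, the per-question score is the maximum over the references, and the dataset score is the mean over questions.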