---
datasets:
- imdb
- cornell_movie_dialogue
- SQuAD
language:
- en
thumbnail:
tags:
- roberta
- roberta-base
- question-answering
- qa
- movies
license: cc-by-4.0
---
# roberta-base + DAPT + Domain-Specific QA
Objective:
This is roberta-base with Domain-Adaptive Pretraining (DAPT) on movie corpora, followed by fine-tuning with a question-answering head on the SQuAD task. The result is a QA model capable of answering questions in the movie domain.
[movie-roberta-base](https://huggingface.co/thatdramebaazguy/movie-roberta-base) was used as the domain-adapted starting point (MovieRoberta).
```python
from transformers import pipeline

model_name = "thatdramebaazguy/movie-roberta-squad"
qa_pipeline = pipeline(task="question-answering", model=model_name, tokenizer=model_name, revision="v1.0")
```
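For example, a minimal invocation (the question and context below are illustrative, not from this card):
```python
# Illustrative inputs; any movie-domain question/context pair works.
result = qa_pipeline(
    question="Who directed Inception?",
    context="Inception is a 2010 science fiction film written and directed by Christopher Nolan.",
)
print(result["answer"])  # expected: "Christopher Nolan"
```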
## Overview
**Language model:** roberta-base  
**Language:** English  
**Downstream task:** extractive QA (SQuAD-style)  
**Training data:** imdb, polarity movie data, cornell_movie_dialogue, MovieLens 25M movie names, SQuADv1  
**Eval data:** MoviesQA (from https://github.com/ibm-aur-nlp/domain-specific-QA)  
**Infrastructure:** 1x Tesla V100  
**Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/train_movieR_just_squadv1.sh)
## Hyperparameters
```
Num examples = 88567
Num Epochs = 10
Instantaneous batch size per device = 32
Total train batch size (w. parallel, distributed & accumulation) = 32
```
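For reference, here is a minimal sketch of an equivalent SQuAD fine-tuning setup in `transformers`. The authors' actual run used the shell script linked above; only the epoch count and batch size come from this card, while the learning rate, `max_length`, and preprocessing follow the standard `run_qa`-style recipe and are assumptions.
```python
from datasets import load_dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    default_data_collator,
)

base = "thatdramebaazguy/movie-roberta-base"  # domain-adapted starting point
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForQuestionAnswering.from_pretrained(base)

squad = load_dataset("squad")

def preprocess(examples):
    # Tokenize question/context pairs and map each answer to token positions.
    tokenized = tokenizer(
        examples["question"],
        examples["context"],
        truncation="only_second",
        max_length=384,           # assumption: not stated on this card
        padding="max_length",
        return_offsets_mapping=True,
    )
    start_positions, end_positions = [], []
    for i, offsets in enumerate(tokenized["offset_mapping"]):
        answer = examples["answers"][i]
        start_char = answer["answer_start"][0]
        end_char = start_char + len(answer["text"][0])
        seq_ids = tokenized.sequence_ids(i)
        ctx_start = seq_ids.index(1)                          # first context token
        ctx_end = len(seq_ids) - 1 - seq_ids[::-1].index(1)   # last context token
        if offsets[ctx_start][0] > start_char or offsets[ctx_end][1] < end_char:
            # Answer was truncated out of this window.
            start_positions.append(0)
            end_positions.append(0)
        else:
            idx = ctx_start
            while idx <= ctx_end and offsets[idx][0] <= start_char:
                idx += 1
            start_positions.append(idx - 1)
            idx = ctx_end
            while idx >= ctx_start and offsets[idx][1] >= end_char:
                idx -= 1
            end_positions.append(idx + 1)
    tokenized["start_positions"] = start_positions
    tokenized["end_positions"] = end_positions
    tokenized.pop("offset_mapping")
    return tokenized

train_set = squad["train"].map(
    preprocess, batched=True, remove_columns=squad["train"].column_names
)

args = TrainingArguments(
    output_dir="movie-roberta-squad",
    num_train_epochs=10,             # from the hyperparameters above
    per_device_train_batch_size=32,  # from the hyperparameters above
    learning_rate=3e-5,              # assumption: not stated on this card
)

Trainer(
    model=model,
    args=args,
    train_dataset=train_set,
    data_collator=default_data_collator,
).train()
```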
## Performance
### Eval on MoviesQA
- eval_samples = 5032
- exact_match = 51.64944
- f1 = 65.53983
### Eval on SQuADv1
- exact_match = 81.23936
- f1 = 89.27827
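The exact-match and F1 figures above are the standard SQuAD metrics. A minimal sketch of computing them with the `evaluate` library (the prediction/reference pair is illustrative):
```python
import evaluate

squad_metric = evaluate.load("squad")
predictions = [{"id": "q1", "prediction_text": "Christopher Nolan"}]
references = [
    {"id": "q1", "answers": {"text": ["Christopher Nolan"], "answer_start": [52]}}
]
print(squad_metric.compute(predictions=predictions, references=references))
# -> {'exact_match': 100.0, 'f1': 100.0}
```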
GitHub Repo:
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)
---