---
datasets:
- MIT Movie
language:
- English
thumbnail:
tags:
- roberta
- roberta-base
- token-classification
- NER
- named-entities
- BIO
- movies
license: cc-by-4.0
---

# roberta-base + Movies NER Task

Objective: This is roberta-base fine-tuned for named entity recognition (NER) on the MIT Movie dataset.

```
from transformers import pipeline

model_name = "thatdramebaazguy/roberta-base-MITmovie"
ner = pipeline(task="ner", model=model_name, tokenizer=model_name, revision="v1.0")
```

## Overview
**Language model:** roberta-base
**Language:** English
**Downstream task:** NER
**Training data:** MIT Movie
**Eval data:** MIT Movie
**Infrastructure:** 2x Tesla V100
**Code:** See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/movieR_NER_squad.sh)

## Hyperparameters
```
Num examples = 6253
Num epochs = 5
Instantaneous batch size per device = 64
Total train batch size (w. parallel, distributed & accumulation) = 128
```

## Performance

### Eval on MIT Movie
- epoch = 5.0
- eval_accuracy = 0.9476
- eval_f1 = 0.8853
- eval_loss = 0.2208
- eval_mem_cpu_alloc_delta = 17MB
- eval_mem_cpu_peaked_delta = 2MB
- eval_mem_gpu_alloc_delta = 0MB
- eval_mem_gpu_peaked_delta = 38MB
- eval_precision = 0.8833
- eval_recall = 0.8874
- eval_runtime = 0:00:03.62
- eval_samples = 1955

Github Repo:
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)

---
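## Usage example

A minimal inference sketch building on the pipeline above. The example sentence, the `aggregation_strategy="simple"` argument, and the printed fields are illustrative assumptions rather than part of the original card; `aggregation_strategy` merges the model's BIO sub-token predictions into whole entity spans.

```
from transformers import pipeline

model_name = "thatdramebaazguy/roberta-base-MITmovie"

# Assumption: "simple" aggregation groups BIO-tagged sub-tokens into entity spans.
ner = pipeline(
    task="ner",
    model=model_name,
    tokenizer=model_name,
    revision="v1.0",
    aggregation_strategy="simple",
)

# Hypothetical query; labels follow the MIT Movie tag set (e.g. ACTOR, YEAR, GENRE).
for entity in ner("Who directed the 1994 thriller starring Samuel L. Jackson?"):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```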