---
datasets:
- imdb
- cornell_movie_dialogue 
- polarity_movie_data
- 25mlens_movie_data

language:
- en

thumbnail: 

tags:
- roberta
- roberta-base
- masked-language-modeling 
- masked-lm

license: cc-by-4.0

---
# roberta-base for MLM 

Objective: Adapt RoBERTa-base to the movie domain by continuing masked language modeling (MLM) pretraining on plain text drawn from several movie datasets. The resulting Movie RoBERTa is intended as a base model for downstream movie-domain applications.
```
from transformers import pipeline

model_name = "thatdramebaazguy/movie-roberta-base"
fill_mask = pipeline(task="fill-mask", model=model_name, tokenizer=model_name, revision="v1.0")
```
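For example (the sentence below is purely illustrative; RoBERTa uses `<mask>` as its mask token):
```
print(fill_mask("The Godfather is widely considered one of the greatest <mask> ever made."))
```
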
## Overview
**Language model:** roberta-base  
**Language:** English  
**Downstream-task:** Fill-Mask  
**Training data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names  
**Eval data:** imdb, polarity movie data, cornell_movie_dialogue, 25mlens movie names    
**Infrastructure:** 4x Tesla V100  
**Code:**  See [example](https://github.com/adityaarunsinghal/Domain-Adaptation/blob/master/scripts/shell_scripts/train_movie_roberta.sh)    
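
A rough sketch of how the raw movie text can be prepared for MLM (illustrative only; the exact preprocessing lives in the linked training script, and only the imdb corpus is shown here, with the other datasets concatenated the same way):
```
from datasets import load_dataset
from transformers import RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

# Load one corpus (imdb shown); the other movie datasets are appended the same way.
imdb = load_dataset("imdb", split="train")

def tokenize(batch):
    # Treat each review as plain text for the MLM objective.
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = imdb.map(tokenize, batched=True, remove_columns=imdb.column_names)
```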

## Hyperparameters
```
Num examples = 4767233
Num Epochs = 2
Instantaneous batch size per device = 20
Total train batch size (w. parallel, distributed & accumulation) = 80
Gradient Accumulation steps = 1
Total optimization steps = 119182
learning_rate = 5e-05
n_gpu = 4
```
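These settings map roughly onto the Hugging Face `Trainer` API. A minimal, hedged sketch (values taken from the log above; `tokenized` is the dataset prepared in the Overview sketch, and the actual run used the linked shell script rather than this exact code):
```
from transformers import (
    RobertaForMaskedLM,
    RobertaTokenizerFast,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")
model = RobertaForMaskedLM.from_pretrained("roberta-base")

# Standard MLM collator: randomly masks 15% of tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="movie-roberta-base",
    num_train_epochs=2,
    per_device_train_batch_size=20,   # 20 per device x 4 GPUs = total batch size 80
    gradient_accumulation_steps=1,
    learning_rate=5e-05,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,  # tokenized movie text from the earlier sketch
    data_collator=collator,
)
trainer.train()
```
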
## Performance

perplexity = 5.0296  
eval_loss = 1.6153  
eval_samples = 20573  

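The reported perplexity is simply the exponential of the evaluation loss, so the two numbers above are consistent:
```
import math

# exp(eval_loss) = exp(1.6153) ≈ 5.03, matching the reported perplexity
print(math.exp(1.6153))
```
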
Some of my work: 
- [Domain-Adaptation Project](https://github.com/adityaarunsinghal/Domain-Adaptation/)

---