---
license: cc
widget:
  - text: "Movie: Parasite Score:"
    example_title: "Parasite"
  - text: "Movie: Come and See Score:"
    example_title: "Come and See"
  - text: "Movie: Harakiri Score:"
    example_title: "Harakiri"
---

# Review Training Bot

This model generates scores and reviews for a given movie title. It is fine-tuned from distilgpt2 on a custom dataset of roughly 120k reviews scraped from Letterboxd. The current version (0.1) reliably produces the correct output format but is still prone to gibberish; further training should improve coherence.
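A minimal sketch of prompting the model with `transformers`, assuming the checkpoint is hosted on the Hugging Face Hub (the default `model_id` below falls back to the distilgpt2 base, since this card does not state the repo id; substitute the fine-tuned checkpoint). The prompt format matches the hosted widget examples.

```python
from transformers import pipeline


def make_prompt(title: str) -> str:
    # The hosted widget uses this exact prompt format.
    return f"Movie: {title} Score:"


def generate_review(title: str, model_id: str = "distilgpt2") -> str:
    # model_id defaults to the distilgpt2 base as a placeholder;
    # point it at the fine-tuned checkpoint for actual scores/reviews.
    generator = pipeline("text-generation", model=model_id)
    result = generator(make_prompt(title), max_new_tokens=60, do_sample=True)
    return result[0]["generated_text"]
```

For example, `generate_review("Parasite")` feeds the prompt `"Movie: Parasite Score:"` to the model and returns the generated continuation.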


## Intended uses & limitations

This model is intended to be used for entertainment. 

The model inherits most of the limitations of distilgpt2, described at https://huggingface.co/distilgpt2, including persistent biases. Another issue is community-specific language on Letterboxd that the model may misinterpret: for example, reviews of an LGBT+ film may use the word "gay" positively, but the model has not learned this contextual usage and may reproduce the word as a slur. Because the current model also struggles to connect movie titles to their reviews, this can happen with any entered title.



## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 10
- eval_batch_size: 20
- seed: 42
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 5000


### Framework versions

- Transformers 4.21.2
- Pytorch 1.12.1+cu113
- Tokenizers 0.12.1