Commit 0deae3c by saattrupdan
1 parent: 03ceca6

Create README.md

Files changed (1)
  1. README.md +40 -0
README.md ADDED
@@ -0,0 +1,40 @@
+ ---
+ license: mit
+ model-index:
+ - name: electra-small-offensive-text-detection-da
+   results: []
+ widget:
+ - text: "Din store idiot"
+ ---
+
+ # Danish Offensive Text Detection based on ELECTRA-small
+
+ This model is a fine-tuned version of [Maltehb/aelaectra-danish-electra-small-cased](https://huggingface.co/Maltehb/aelaectra-danish-electra-small-cased), trained on approximately 5 million Facebook comments from [DR's](https://dr.dk/) public Facebook pages. The labels were generated automatically with weak supervision, using the [Snorkel](https://www.snorkel.org/) framework.
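As a rough illustration of the weak-supervision idea mentioned above, the sketch below combines two keyword-based labeling functions by majority vote. It is not the labeling setup used for this model, which the README does not describe: the word lists, function names, and label constants are all hypothetical, and Snorkel itself fits a probabilistic label model over the labeling-function outputs rather than taking a plain vote.

```python
from collections import Counter
from typing import Callable, List

# Label constants (an assumed convention for this sketch).
ABSTAIN, NOT_OFFENSIVE, OFFENSIVE = -1, 0, 1

# Hypothetical word lists; the real labeling functions are not given in the README.
INSULTS = {"idiot", "taber"}
POLITE = {"tak", "venlig"}


def _tokens(text: str) -> List[str]:
    """Lowercase the text and strip simple punctuation from each token."""
    return [token.strip(".,!?") for token in text.lower().split()]


def lf_contains_insult(text: str) -> int:
    """Vote OFFENSIVE if any token is a listed insult, otherwise abstain."""
    return OFFENSIVE if any(t in INSULTS for t in _tokens(text)) else ABSTAIN


def lf_contains_polite_word(text: str) -> int:
    """Vote NOT_OFFENSIVE if any token is a listed polite word, otherwise abstain."""
    return NOT_OFFENSIVE if any(t in POLITE for t in _tokens(text)) else ABSTAIN


LABELING_FUNCTIONS: List[Callable[[str], int]] = [
    lf_contains_insult,
    lf_contains_polite_word,
]


def majority_vote(text: str, lfs: List[Callable[[str], int]]) -> int:
    """Combine labeling-function votes; abstain when there are no votes or a tie."""
    votes = [vote for vote in (lf(text) for lf in lfs) if vote != ABSTAIN]
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


if __name__ == "__main__":
    for comment in ["Din store idiot", "Tak for det", "Hej med dig"]:
        print(comment, "->", majority_vote(comment, LABELING_FUNCTIONS))
```

Comments on which every function abstains, or on which the votes tie, stay unlabeled in this sketch; a weakly supervised pipeline would typically drop them from the training set.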
+
+ The model has been evaluated on a test set consisting of 500 samples.
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 32
+ - eval_batch_size: 32
+ - gradient_accumulation_steps: 1
+ - total_train_batch_size: 32
+ - seed: 4242
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - max_steps: 500000
+ - fp16: True
+ - eval_steps: 1000
+ - early_stopping_patience: 100
+ - early_stopping_patience: 100
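A few derived quantities follow from the hyperparameters listed above. The sketch below computes them under assumptions the README does not state: a single training device, no learning-rate warmup (none is listed), a linear schedule that decays to zero at `max_steps`, and a patience counted in evaluation rounds (as the `EarlyStoppingCallback` in Transformers does). The helper names are made up for this example.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 32
GRADIENT_ACCUMULATION_STEPS = 1
MAX_STEPS = 500_000
EVAL_STEPS = 1_000
EARLY_STOPPING_PATIENCE = 100

# Assumption: the README does not say how many devices were used.
NUM_DEVICES = 1


def total_train_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step: int, warmup_steps: int = 0) -> float:
    """Learning rate at `step` for a linear schedule decaying to zero at MAX_STEPS.

    Assumes no warmup by default, since the list does not mention any.
    """
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, MAX_STEPS - step)
    return LEARNING_RATE * remaining / max(1, MAX_STEPS - warmup_steps)


def steps_without_improvement_before_stop() -> int:
    """Training steps that may pass without improvement before early stopping.

    Assumes patience is counted in evaluation rounds, one every EVAL_STEPS steps.
    """
    return EVAL_STEPS * EARLY_STOPPING_PATIENCE


def max_examples_seen() -> int:
    """Upper bound on training examples processed if training is never stopped early."""
    return MAX_STEPS * total_train_batch_size()


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    print("lr halfway through training:", linear_lr(MAX_STEPS // 2))
    print("steps without improvement before stop:", steps_without_improvement_before_stop())
    print("max examples seen:", max_examples_seen())
```

Under these assumptions, early stopping triggers after 100,000 steps without improvement, and a full run of 500,000 steps would process at most 16 million examples, or a little over three passes if all of the roughly 5 million comments were used for training.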
+
+
+ ### Framework versions
+
+ - Transformers 4.20.1
+ - PyTorch 1.11.0+cu113
+ - Datasets 2.3.2
+ - Tokenizers 0.12.1