Nikola299 committed · Commit c68b60f · verified · 1 Parent(s): 3e7e156

Update README.md

Files changed (1): README.md +46 -36
README.md CHANGED
@@ -1,57 +1,67 @@
  ---
  license: apache-2.0
- base_model: google-bert/bert-base-cased
  tags:
- - generated_from_trainer
- metrics:
- - accuracy
- model-index:
- - name: bert-base-cased_v3
-   results: []
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->

- # bert-base-cased_v3

- This model is a fine-tuned version of [google-bert/bert-base-cased](https://huggingface.co/google-bert/bert-base-cased) on an unknown dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.2443
- - Accuracy: 0.7326

- ## Model description

- More information needed

- ## Intended uses & limitations

- More information needed

- ## Training and evaluation data

- More information needed

- ## Training procedure

- ### Training hyperparameters

- The following hyperparameters were used during training:
- - learning_rate: 2e-05
- - train_batch_size: 32
- - eval_batch_size: 16
- - seed: 42
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - num_epochs: 11.0

- ### Training results

- ### Framework versions

- - Transformers 4.43.0.dev0
- - Pytorch 2.0.1+cu117
- - Datasets 2.20.0
- - Tokenizers 0.19.1
 
  ---
+ base_model: INSAIT-Institute/BgGPT-7B-Instruct-v0.2
+ library_name: peft
  license: apache-2.0
+ language:
+ - en
  tags:
+ - propaganda
  ---

+ # Model Card for identrics/EN_propaganda_detector
 
+ ## Model Details

+ - **Developed by:** Identrics
+ - **Language:** English
+ - **License:** apache-2.0
+ - **Finetuned from model:** google-bert/bert-base-cased
+ - **Context window:** 512 tokens
+ ## Model Description

+ This model is a fine-tuned version of google-bert/bert-base-cased for a propaganda detection task. It is effectively a binary classifier, determining whether propaganda is present in the input string.
+ This model was created by [`Identrics`](https://identrics.ai/) within the scope of the Wasper project.

+ ## Uses

+ To be used as a binary classifier to identify whether propaganda is present in a string containing a comment from a social media site.
+ ### Example

+ First, install the direct dependencies:
+ ```
+ pip install transformers torch accelerate
+ ```
+ Then the model can be downloaded and used for inference:
+ ```py
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+
+ model = AutoModelForSequenceClassification.from_pretrained("identrics/EN_propaganda_detector", num_labels=2)
+ tokenizer = AutoTokenizer.from_pretrained("identrics/EN_propaganda_detector")
+
+ tokens = tokenizer("Our country is the most powerful country in the world!", return_tensors="pt")
+ output = model(**tokens)
+ print(output.logits)
+ ```
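The printed logits are unnormalized scores. A minimal sketch of turning them into a probability and a label — the example logit values and the 0/1 label order here are assumptions, not taken from the card; check `model.config.id2label` for the actual mapping:

```python
import torch

# Hypothetical logits, shaped like model(**tokens).logits for a 2-class head
logits = torch.tensor([[-1.2, 2.3]])

# Softmax converts logits to class probabilities; argmax picks the label id
probs = torch.softmax(logits, dim=-1)
pred = int(probs.argmax(dim=-1))

# Assumed label order (0 = no propaganda, 1 = propaganda); verify via model.config.id2label
labels = {0: "no propaganda", 1: "propaganda"}
print(labels[pred], round(float(probs[0, pred]), 3))
```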
+ ## Training Details

+ Trained on a corpus of 200 human-generated comments, augmented with 200 more synthetic comments...

+ Achieved an F1 score of x%
+ <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+ ### Framework versions

+ - PEFT 0.11.1