m3hrdadfi committed on
Commit
86b64c4
1 Parent(s): 3856494

Create README.md

  1. README.md +80 -0
README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ language: fa
+ license: apache-2.0
+ ---
+
+ # ParsBERT (v2.0)
+ A Transformer-based Model for Persian Language Understanding
+
+
+ We reconstructed the vocabulary and fine-tuned ParsBERT v1.1 on new Persian corpora so that ParsBERT can be used in other scopes.
+ Please follow the [ParsBERT](https://github.com/hooshvare/parsbert) repo for the latest information about previous and current models.
+
+
+ ## Persian Text Classification [DigiMag, Persian News]
+
+ The task is supervised text classification: assigning a label to each text in two existing datasets, `DigiMag` and `Persian News`.
+
+
+
+ ### DigiMag
+
+ The dataset contains a total of 8,515 articles scraped from [Digikala Online Magazine](https://www.digikala.com/mag/), divided into seven classes:
+
+ 1. Video Games
+ 2. Shopping Guide
+ 3. Health Beauty
+ 4. Science Technology
+ 5. General
+ 6. Art Cinema
+ 7. Books Literature
+
+
+ | Label | # of Articles |
+ |:------------------:|:-------------:|
+ | Video Games | 1967 |
+ | Shopping Guide | 125 |
+ | Health Beauty | 1610 |
+ | Science Technology | 2772 |
+ | General | 120 |
+ | Art Cinema | 1667 |
+ | Books Literature | 254 |
+
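The class counts above are heavily skewed: the largest class (Science Technology) has roughly 23 times as many articles as the smallest (General). The following is a minimal sketch, using only the counts from the table, that prints each class's share of the dataset and a "balanced" class weight of the form `total / (n_classes * count)`. The weighting scheme is an illustrative assumption; the README does not say whether any class weighting was used when training ParsBERT.

```python
# Article counts per class, copied from the DigiMag table.
counts = {
    "Video Games": 1967,
    "Shopping Guide": 125,
    "Health Beauty": 1610,
    "Science Technology": 2772,
    "General": 120,
    "Art Cinema": 1667,
    "Books Literature": 254,
}

total = sum(counts.values())
n_classes = len(counts)

# Fraction of the dataset that belongs to each class.
shares = {label: n / total for label, n in counts.items()}

# "Balanced" weights (an assumption for illustration, not the authors' setup):
# rare classes receive proportionally larger weights.
class_weights = {label: total / (n_classes * n) for label, n in counts.items()}

# Print classes from most to least frequent.
for label in sorted(counts, key=counts.get, reverse=True):
    print(
        f"{label:<20} {counts[label]:>5} "
        f"{shares[label]:6.1%}  weight={class_weights[label]:.2f}"
    )
```

Weights like these could be passed to a weighted loss function, but whether that helps on this dataset would need to be checked empirically.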
+
+ **Download**
+ You can download the dataset from [here](https://drive.google.com/uc?id=1YgrCYY-Z0h2z0-PfWVfOGt1Tv0JDI-qz).
+
+
+ ## Results
+
+ The following table summarizes the F1 scores obtained by ParsBERT compared with other models and architectures.
+
+ | Dataset | ParsBERT v2 | ParsBERT v1 | mBERT |
+ |:-----------------:|:-----------:|:-----------:|:-----:|
+ | Digikala Magazine | 93.65* | 93.59 | 90.72 |
+
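For readers who want to compute a comparable metric on their own predictions, here is a minimal pure-Python sketch of per-class F1 with macro and weighted averaging. The README does not state which averaging was used for the reported scores, so both variants are shown; the function names and the toy labels are hypothetical and serve only as illustration.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for two equal-length sequences of labels."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true examples per class."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy example (hypothetical labels, not real model output).
y_true = ["Video Games", "Video Games", "General", "Art Cinema"]
y_pred = ["Video Games", "General", "General", "Art Cinema"]
print(f"macro F1:    {macro_f1(y_true, y_pred):.4f}")
print(f"weighted F1: {weighted_f1(y_true, y_pred):.4f}")
```

On an imbalanced dataset such as DigiMag the two averages can differ noticeably, because macro averaging gives small classes the same influence as large ones.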
+
+
+ ## How to use :hugs:
+
+ | Task | Notebook |
+ |---------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | Text Classification | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hooshvare/parsbert/blob/master/notebooks/Taaghche_Sentiment_Analysis.ipynb) |
+
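Outside of a notebook, a fine-tuned checkpoint can usually be loaded through the `transformers` text-classification pipeline. The sketch below assumes that `transformers` and a backend such as PyTorch are installed. The helper name `classify_texts` is hypothetical, and the checkpoint identifier is an assumption that is not stated in this README, so please check the model page for the exact name.

```python
def classify_texts(texts, model_id):
    """Classify Persian texts with a fine-tuned checkpoint from the Hugging Face Hub.

    Returns a list of dicts, each with "label" and "score" keys.
    """
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id, tokenizer=model_id)
    # Magazine articles can exceed the model's maximum input length,
    # so ask the tokenizer to truncate long inputs.
    return classifier(list(texts), truncation=True)


# Assumed checkpoint name (not given in this README); verify before use.
ASSUMED_MODEL_ID = "HooshvareLab/bert-fa-base-uncased-clf-digimag"

# Usage (downloads the model on first call):
#   results = classify_texts([article_text], ASSUMED_MODEL_ID)
#   print(results[0]["label"], results[0]["score"])
```

The first call downloads the model weights, so it needs network access and may take a while.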
+
+ ### BibTeX entry and citation info
+
+ Please cite this work in publications as follows:
+
+ ```bibtex
+ @article{ParsBERT,
+ title={ParsBERT: Transformer-based Model for Persian Language Understanding},
+ author={Mehrdad Farahani and Mohammad Gharachorloo and Marzieh Farahani and Mohammad Manthouri},
+ journal={ArXiv},
+ year={2020},
+ volume={abs/2005.12515}
+ }
+ ```
+
+ ## Questions?
+ Post a GitHub issue on the [ParsBERT Issues](https://github.com/hooshvare/parsbert/issues) page.