m3hrdadfi committed on
Commit
d3c78a9
1 Parent(s): d4579d9

Create README.md

Files changed (1)
  1. README.md +80 -0
README.md ADDED
@@ -0,0 +1,80 @@
+ ---
+ language: fa
+ license: apache-2.0
+ ---
+
+ # ParsBERT (v2.0)
+ A Transformer-based Model for Persian Language Understanding
+
+
+ We rebuilt the vocabulary and fine-tuned ParsBERT v1.1 on new Persian corpora so that ParsBERT can be used in other scopes.
+ Please follow the [ParsBERT](https://github.com/hooshvare/parsbert) repo for the latest information about previous and current models.
+
+
+ ## Persian Text Classification [DigiMag, Persian News]
+
+ The task is to label texts in a supervised manner, using two existing datasets: `DigiMag` and `Persian News`.
+
+
+
+ ### Persian News
+
+ A dataset of news articles scraped from the websites of various online news agencies. It contains 16,438 articles spread over eight classes.
+
+ 1. Social
+ 2. Economic
+ 3. International
+ 4. Political
+ 5. Science Technology
+ 6. Cultural Art
+ 7. Sport
+ 8. Medical
+
+ | Label | # of articles |
+ |:------------------:|:-------------:|
+ | Social | 2170 |
+ | Economic | 1564 |
+ | International | 1975 |
+ | Political | 2269 |
+ | Science Technology | 2436 |
+ | Cultural Art | 2558 |
+ | Sport | 1381 |
+ | Medical | 2085 |
+
+
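The counts in the table can be checked against the stated total of 16,438 articles, and the class balance inspected, with a few lines of Python. This is only a sketch: the dictionary is copied by hand from the table, and the names `CLASS_COUNTS` and `class_shares` are illustrative and not part of any ParsBERT tooling.

```python
# Article counts per class, copied by hand from the table above.
CLASS_COUNTS = {
    "Social": 2170,
    "Economic": 1564,
    "International": 1975,
    "Political": 2269,
    "Science Technology": 2436,
    "Cultural Art": 2558,
    "Sport": 1381,
    "Medical": 2085,
}


def class_shares(counts):
    """Return each class's share of the dataset as a fraction in [0, 1]."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


total = sum(CLASS_COUNTS.values())
shares = class_shares(CLASS_COUNTS)
largest = max(CLASS_COUNTS, key=CLASS_COUNTS.get)
smallest = min(CLASS_COUNTS, key=CLASS_COUNTS.get)

print(f"total articles: {total}")
# List the classes from most to least frequent.
for label, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<20}{share:6.1%}")
print(f"largest: {largest}, smallest: {smallest}")
```

The classes are moderately imbalanced (the largest class has nearly twice as many articles as the smallest), which may matter when choosing how to split the data or average the evaluation metric.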
+ **Download**
+ You can download the dataset from [here](https://drive.google.com/uc?id=1B6xotfXCcW9xS1mYSBQos7OCg0ratzKC).
+
+
+ ## Results
+
+ The following table summarizes the F1 scores obtained by ParsBERT compared to other models and architectures.
+
+ | Dataset | ParsBERT v2 | ParsBERT v1 | mBERT |
+ |:-----------------:|:-----------:|:-----------:|:-----:|
+ | Persian News | 97.44* | 97.19 | 95.79 |
+
+
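For readers unfamiliar with the metric, the sketch below shows how an F1 score for a multi-class task can be computed in plain Python. It makes two assumptions: this README does not say which averaging scheme was used for the reported numbers, so macro averaging is shown purely as an illustration, and the toy labels are made up and are not taken from the Persian News dataset.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        # F1 = 2*TP / (2*TP + FP + FN), defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (an assumed averaging choice)."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Toy labels, made up for illustration only.
y_true = ["Sport", "Sport", "Medical", "Medical", "Economic", "Economic"]
y_pred = ["Sport", "Medical", "Medical", "Medical", "Economic", "Sport"]

scores = per_class_f1(y_true, y_pred)
print(scores)
print(f"macro F1: {macro_f1(y_true, y_pred):.4f}")
```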
+ ## How to use :hugs:
+
+ | Task | Notebook |
+ |---------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+ | Text Classification | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/hooshvare/parsbert/blob/master/notebooks/Taaghche_Sentiment_Analysis.ipynb) |
+
+
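Outside the notebook, a fine-tuned checkpoint like this one can typically be loaded with the `pipeline` helper from the Hugging Face `transformers` library. The sketch below is hedged accordingly: this README does not state the model's Hub identifier, so `load_classifier` takes it as an argument, and the helper names and the sample output are hypothetical. The label strings a real model returns depend on its configuration.

```python
def load_classifier(model_id):
    """Build a text-classification pipeline for the given Hub model ID.

    `model_id` must be supplied by the caller; this README does not name it.
    """
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_prediction(predictions):
    """Pick the highest-scoring entry from a list of {'label', 'score'} dicts."""
    return max(predictions, key=lambda p: p["score"])


# Made-up output in the shape a text-classification pipeline returns,
# used here so the example runs without downloading a model.
sample_output = [
    {"label": "Sport", "score": 0.91},
    {"label": "Medical", "score": 0.09},
]
best = top_prediction(sample_output)
print(best["label"], best["score"])

# With a real checkpoint, usage would look like this:
#   classifier = load_classifier(model_id)        # model_id: your Hub identifier
#   print(top_prediction(classifier(persian_text)))
```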
+ ### BibTeX entry and citation info
+
+ Please cite this work in publications as follows:
+
+ ```bibtex
+ @article{ParsBERT,
+ title={ParsBERT: Transformer-based Model for Persian Language Understanding},
+ author={Mehrdad Farahani and Mohammad Gharachorloo and Marzieh Farahani and Mohammad Manthouri},
+ journal={ArXiv},
+ year={2020},
+ volume={abs/2005.12515}
+ }
+ ```
+
+ ## Questions?
+ Post a GitHub issue on the [ParsBERT Issues](https://github.com/hooshvare/parsbert/issues) page.