julien-c (HF staff) committed
Commit 524b8b1
1 Parent(s): 8fb05db

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/m3hrdadfi/albert-fa-base-v2/README.md

Files changed (1):
1. README.md (+161, -0)
README.md ADDED

---
language: fa
tags:
- albert-persian
- persian-lm
license: apache-2.0
datasets:
- Persian Wikidumps
- MirasText
- BigBang Page
- Chetor
- Eligasht
- DigiMag
- Ted Talks
- Books (Novels, ...)
---

# ALBERT-Persian

## ALBERT-Persian: A Lite BERT for Self-supervised Learning of Language Representations for the Persian Language

## Introduction

ALBERT-Persian was trained on a massive amount of public corpora ([Persian Wikidumps](https://dumps.wikimedia.org/fawiki/), [MirasText](https://github.com/miras-tech/MirasText)) and six other manually crawled text corpora from various types of websites ([BigBang Page](https://bigbangpage.com/) `scientific`, [Chetor](https://www.chetor.com/) `lifestyle`, [Eligasht](https://www.eligasht.com/Blog/) `itinerary`, [Digikala](https://www.digikala.com/mag/) `digital magazine`, [Ted Talks](https://www.ted.com/talks) `general conversational`, Books `novels, storybooks, short stories from old to the contemporary era`).

## Intended uses & limitations

You can use the raw model for either masked language modeling or sentence order prediction, but it is mostly intended to be fine-tuned on a downstream task. See the [model hub](https://huggingface.co/models?search=albert-fa) to look for fine-tuned versions on a task that interests you.

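For example, masked language modeling can be exercised through the `fill-mask` pipeline. This is a minimal sketch that is not part of the original card; it assumes the published checkpoint ships with its masked-LM head (the example sentence is adapted from the card's own slogan text):

```python
from transformers import pipeline

# Fill-mask pipeline with the raw ALBERT-Persian checkpoint
fill_mask = pipeline("fill-mask", model="m3hrdadfi/albert-fa-base-v2")

# Use the tokenizer's own mask token rather than hardcoding it
mask = fill_mask.tokenizer.mask_token

# "Our slogan is AI for <mask>." -- print the top predictions with their scores
for prediction in fill_mask(f"شعار ما هوش مصنوعی برای {mask} است."):
    print(prediction["token_str"], prediction["score"])
```
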

### How to use

#### TensorFlow 2.0

```python
from transformers import AutoConfig, AutoTokenizer, TFAutoModel

# Load the config, tokenizer, and TensorFlow model from the Hugging Face Hub
config = AutoConfig.from_pretrained("m3hrdadfi/albert-fa-base-v2")
tokenizer = AutoTokenizer.from_pretrained("m3hrdadfi/albert-fa-base-v2")
model = TFAutoModel.from_pretrained("m3hrdadfi/albert-fa-base-v2")

text = "ما در هوشواره معتقدیم با انتقال صحیح دانش و آگاهی، همه افراد می‌توانند از ابزارهای هوشمند استفاده کنند. شعار ما هوش مصنوعی برای همه است."
tokenizer.tokenize(text)

>>> ['▁ما', '▁در', '▁هوش', 'واره', '▁معتقد', 'یم', '▁با', '▁انتقال', '▁صحیح', '▁دانش', '▁و', '▁اگاه', 'ی', '،', '▁همه', '▁افراد', '▁می', '▁توانند', '▁از', '▁ابزارهای', '▁هوشمند', '▁استفاده', '▁کنند', '.', '▁شعار', '▁ما', '▁هوش', '▁مصنوعی', '▁برای', '▁همه', '▁است', '.']

```

#### PyTorch

```python
from transformers import AutoConfig, AutoTokenizer, AutoModel

# Load the config, tokenizer, and PyTorch model from the Hugging Face Hub
config = AutoConfig.from_pretrained("m3hrdadfi/albert-fa-base-v2")
tokenizer = AutoTokenizer.from_pretrained("m3hrdadfi/albert-fa-base-v2")
model = AutoModel.from_pretrained("m3hrdadfi/albert-fa-base-v2")
```

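To obtain contextual embeddings from the PyTorch model, the following minimal sketch (not part of the original card) runs a forward pass on the example sentence from the TensorFlow section; it assumes the snippet above has already been executed:

```python
import torch

# Encode the sentence and run a forward pass without tracking gradients
text = "شعار ما هوش مصنوعی برای همه است."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Token-level representations with shape (batch_size, sequence_length, hidden_size);
# older transformers versions return a tuple instead, in which case
# the hidden states are outputs[0].
print(outputs.last_hidden_state.shape)
```
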
## Training

ALBERT-Persian is the first attempt to train ALBERT for the Persian language. The model was trained on Google's ALBERT BASE Version 2.0 configuration over a wide range of writing styles and subjects (e.g., scientific, novels, news), covering more than `3.9M` documents, `73M` sentences, and `1.3B` words, following the same procedure we used for [ParsBERT](https://github.com/hooshvare/parsbert).

## Goals

The training objectives reached the following values after 140K steps:

```bash
***** Eval results *****
global_step = 140000
loss = 2.0080082
masked_lm_accuracy = 0.6141017
masked_lm_loss = 1.9963315
sentence_order_accuracy = 0.985
sentence_order_loss = 0.06908702
```


## Derivative models

### Base Config

#### Albert Model
- [m3hrdadfi/albert-fa-base-v2](https://huggingface.co/m3hrdadfi/albert-fa-base-v2)

#### Albert Sentiment Analysis
- [m3hrdadfi/albert-fa-base-v2-sentiment-digikala](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-digikala)
- [m3hrdadfi/albert-fa-base-v2-sentiment-snappfood](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-snappfood)
- [m3hrdadfi/albert-fa-base-v2-sentiment-deepsentipers-binary](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-deepsentipers-binary)
- [m3hrdadfi/albert-fa-base-v2-sentiment-deepsentipers-multi](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-deepsentipers-multi)
- [m3hrdadfi/albert-fa-base-v2-sentiment-binary](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-binary)
- [m3hrdadfi/albert-fa-base-v2-sentiment-multi](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-sentiment-multi)

#### Albert Text Classification
- [m3hrdadfi/albert-fa-base-v2-clf-digimag](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-clf-digimag)
- [m3hrdadfi/albert-fa-base-v2-clf-persiannews](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-clf-persiannews)

#### Albert NER
- [m3hrdadfi/albert-fa-base-v2-ner](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-ner)
- [m3hrdadfi/albert-fa-base-v2-ner-arman](https://huggingface.co/m3hrdadfi/albert-fa-base-v2-ner-arman)

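Any of the fine-tuned checkpoints above can be loaded through the standard `transformers` pipelines. As a minimal sketch (not part of the original card; the exact label names come from each checkpoint's config), the Digikala sentiment model can be used like this:

```python
from transformers import pipeline

# Sentiment analysis with one of the derivative checkpoints listed above
classifier = pipeline(
    "sentiment-analysis",
    model="m3hrdadfi/albert-fa-base-v2-sentiment-digikala",
)

# Example Digikala-style comment: "The battery quality is excellent and it is worth buying."
print(classifier("کیفیت باتری عالی است و ارزش خرید دارد."))
```
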
## Eval results

The following tables summarize the F1 scores obtained by ALBERT-Persian as compared to other models and architectures.

### Sentiment Analysis (SA) Task

| Dataset                  | ALBERT-fa-base-v2 | ParsBERT-v1 | mBERT | DeepSentiPers |
|:------------------------:|:-----------------:|:-----------:|:-----:|:-------------:|
| Digikala User Comments   |       81.12       |    81.74    | 80.74 |       -       |
| SnappFood User Comments  |       85.79       |    88.12    | 87.87 |       -       |
| SentiPers (Multi Class)  |       66.12       |    71.11    |   -   |     69.33     |
| SentiPers (Binary Class) |       91.09       |    92.13    |   -   |     91.98     |

### Text Classification (TC) Task

| Dataset           | ALBERT-fa-base-v2 | ParsBERT-v1 | mBERT |
|:-----------------:|:-----------------:|:-----------:|:-----:|
| Digikala Magazine |       92.33       |    93.59    | 90.72 |
| Persian News      |       97.01       |    97.19    | 95.79 |

### Named Entity Recognition (NER) Task

| Dataset | ALBERT-fa-base-v2 | ParsBERT-v1 | mBERT | MorphoBERT | Beheshti-NER | LSTM-CRF | Rule-Based CRF | BiLSTM-CRF |
|:-------:|:-----------------:|:-----------:|:-----:|:----------:|:------------:|:--------:|:--------------:|:----------:|
| PEYMA   |       88.99       |    93.10    | 86.64 |     -      |    90.59     |    -     |     84.00      |     -      |
| ARMAN   |       97.43       |    98.79    | 95.89 |    89.9    |    84.03     |  86.55   |       -        |   77.45    |

### BibTeX entry and citation info

Please cite the following in publications:

```bibtex
@misc{ALBERT-Persian,
  author = {Mehrdad Farahani},
  title = {ALBERT-Persian: A Lite BERT for Self-supervised Learning of Language Representations for the Persian Language},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/m3hrdadfi/albert-persian}},
}

@article{ParsBERT,
  title={ParsBERT: Transformer-based Model for Persian Language Understanding},
  author={Mehrdad Farahani and Mohammad Gharachorloo and Marzieh Farahani and Mohammad Manthouri},
  journal={ArXiv},
  year={2020},
  volume={abs/2005.12515}
}
```

## Questions?
Post a GitHub issue on the [ALBERT-Persian](https://github.com/m3hrdadfi/albert-persian) repo.