Commit e1af151 by TymaaHammouda
1 Parent(s): ee432a1

Update README.md

  1. README.md +56 -3
@@ -1,3 +1,56 @@
- ---
- license: mit
- ---
+ ## ArBanking77: Intent Detection Neural Model and a New Dataset in Modern and Dialectical Arabic
+
+
+ Online Demo
+ --------
+ You can try our model using the demo link below:
+
+ [https://sina.birzeit.edu/arbanking77/](https://sina.birzeit.edu/arbanking77/)
+
+
+ ArBanking77 Corpus
+ --------
+ ArBanking77 consists of 31,404 queries in Modern Standard Arabic (MSA) and Palestinian dialect (PAL), manually Arabized and localized from the original English Banking77 dataset, which consists of 13,083 queries. Each query is classified into one of 77 classes (intents), such as card arrival, card linking, exchange rate, and automatic top-up. A neural model based on AraBERT was fine-tuned on the ArBanking77 dataset, reaching an F1-score of 92% on MSA and 90% on PAL.
+
+
+ Corpus Download
+ --------
+ Sample data is available in the `data` directory. The entire ArBanking77 corpus is
+ available for download upon request for academic and commercial use. To request
+ ArBanking77 (the corpus and the model), visit:
+
+ [https://sina.birzeit.edu/arbanking77/](https://sina.birzeit.edu/arbanking77/)
+
+ Model Download
+ --------
+ Hugging Face: [https://huggingface.co/SinaLab/ArBanking77](https://huggingface.co/SinaLab/ArBanking77)
+
+
+ Model Training
+ --------
+
+ ```commandline
+ python run_glue_no_trainer.py \
+     --model_name_or_path aubmindlab/bert-base-arabertv2 \
+     --train_file ./data/Banking77_Arabized_Ver3_train_MSA_PAL_merged.json \
+     --validation_file ./data/Banking77_Arabized_Ver3_val_MSA_PAL_merged.json \
+     --seed 42 \
+     --max_length 128 \
+     --learning_rate 4e-5 \
+     --num_train_epochs 20 \
+     --per_device_train_batch_size 32 \
+     --output_dir ./results
+ ```
+
+ File source: [run_glue_no_trainer.py](https://github.com/huggingface/transformers/blob/e9ad51306fdcc3fb79d837d667e21c6d075a2451/examples/pytorch/text-classification/run_glue_no_trainer.py)
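
The training command passes local JSON files through `--train_file` and `--validation_file`. As a rough sketch of what such a file could look like, the snippet below writes and reads back a JSON Lines file with one query per line. The field names `text` and `label`, the label strings, and the two Arabic queries are illustrative assumptions and are not taken from the ArBanking77 corpus; check the sample files in the `data` directory for the actual schema. To the best of our understanding, the script uses the `label` column as the target and treats the remaining column as the input text.

```python
import json
import os
import tempfile

# Hypothetical records: field names, label strings, and queries are assumptions
# made for illustration, not the actual ArBanking77 schema.
EXAMPLES = [
    {"text": "متى ستصل بطاقتي الجديدة؟", "label": "card_arrival"},
    {"text": "ما هو سعر الصرف اليوم؟", "label": "exchange_rate"},
]


def write_jsonl(records, path):
    """Write one JSON object per line, keeping Arabic text unescaped."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSON Lines file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Write the toy records to a temporary file and read them back.
    out_path = os.path.join(tempfile.mkdtemp(), "toy_train.json")
    write_jsonl(EXAMPLES, out_path)
    for row in read_jsonl(out_path):
        print(row["label"], row["text"])
```

A file written this way could then be passed to `--train_file` in place of the merged training file for a quick smoke test, assuming the schema assumption above holds.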
+
+
+ Credits
+ -------
+ This research is partially funded by the Palestinian Higher Council for Innovation and Excellence.
+
+
+ Citation
+ -------
+ Mustafa Jarrar, Ahmet Birim, Mohammed Khalilia, Mustafa Erden, and Sana Ghanem: [ArBanking77: Intent Detection Neural Model and a New Dataset in Modern and Dialectical Arabic](http://www.jarrar.info/publications/JBKEG23.pdf). In Proceedings of the 1st Arabic Natural Language Processing Conference (ArabicNLP), part of EMNLP 2023. ACL.