---
language:
- ar
datasets:
- HARD
tags:
- HARD
widget:
- text: "جيد. المكان جميل وهاديء. كل شي جيد ونظيف"
- text: "استغرب تقييم الفندق كخمس نجوم”. لا شي. يستحق"
---

# BERT-ASTD Balanced

An Arabic BERT model fine-tuned on the balanced version of the Hotel Arabic Reviews Dataset (HARD), collected from booking.com, to classify sentiment in Arabic text.

## Data

The model was fine-tuned on ~93,000 Arabic hotel reviews, starting from the BERT large Arabic model.

Dataset split:

- Train: 70%
- Validation: 10%
- Test: 20%

## Results

| class    | precision | recall | f1-score | support |
|----------|-----------|--------|----------|---------|
| 0        | 0.9733    | 0.9547 | 0.9639   | 10570   |
| 1        | 0.9555    | 0.9738 | 0.9646   | 10570   |
| Accuracy |           |        | 0.9642   | 21140   |

## How to use

To use this model, install `torch` or `tensorflow` together with the Hugging Face `transformers` library, then load the model directly:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "mofawzy/Bert-hard-balanced"

# Load the fine-tuned classifier (two output classes) and its matching tokenizer.
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
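
The loading snippet above stops before inference. Below is a minimal sketch of how a prediction could be run, assuming the PyTorch backend. The helper names `softmax` and `predict` are illustrative and not part of the model or the library. The card does not state which class id corresponds to which sentiment, so the sketch returns the raw class id (0 or 1) with its probability.

```python
import math

MODEL_NAME = "mofawzy/Bert-hard-balanced"  # model id taken from the loading snippet


def softmax(logits):
    """Convert one row of raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(texts, model_name=MODEL_NAME):
    """Return a (class_id, probability) pair for each input text.

    Hypothetical helper: assumes `torch` and `transformers` are installed.
    """
    # Heavy dependencies are imported lazily so `softmax` stays usable without them.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)
    model.eval()  # disable dropout for inference

    # Pad to the longest text in the batch and truncate over-long reviews.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**batch).logits

    results = []
    for row in logits.tolist():
        probs = softmax(row)
        class_id = max(range(len(probs)), key=probs.__getitem__)
        results.append((class_id, probs[class_id]))
    return results
```

Calling `predict(["جيد. المكان جميل وهاديء. كل شي جيد ونظيف"])` would return a one-element list holding the predicted class id and its probability. The first call downloads the model weights, so it needs network access.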