---
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- bert
- Aspects
- ABSA
- Aspects Extraction
- roberta
---
# Model Card: Implicit and Explicit Aspect Extraction from Restaurant Reviews (RoBERTa-Large)
<!-- Provide a quick summary of what the model is/does. -->
**Extracting Implicit and Explicit Aspects from Restaurant Reviews using a RoBERTa-Large Variant with Benchmarked Efficiency and a Custom Dataset**

We present an approach to extracting implicit and explicit aspects from restaurant reviews. Leveraging the RoBERTa-Large variant, our method achieves strong performance while utilizing a custom dataset.

Our research addresses the task of aspect extraction, which involves identifying both explicit aspects mentioned directly in reviews and implicit aspects that are only referred to indirectly. By employing RoBERTa-Large, a state-of-the-art language model, we leverage its contextual understanding to capture nuanced information from review text.

To assess the efficiency and accuracy of our approach, we benchmarked the system against existing methods in the field; the results show gains in precision, recall, and overall performance.

Furthermore, we developed a custom dataset tailored to the restaurant domain, encompassing a diverse range of reviews from various platforms. This dataset allowed us to train the model with domain-specific knowledge, leading to better aspect-extraction outcomes.

Overall, this work presents an efficient solution for aspect extraction from restaurant reviews. By combining the RoBERTa-Large variant with a carefully curated custom dataset, we obtain results that surpass existing approaches, with implications for sentiment analysis, opinion mining, and other natural language processing applications in the restaurant domain.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Ali Haider
- **Shared by:** Ali Haider
- **Model type:** BERT variant (RoBERTa-Large) for multi-label aspect classification
- **Language(s) (NLP):** English (Restaurant Domain Reviews)
- **Finetuned from model:** RoBERTa-Large
## Uses
The aspect extraction model for the restaurant domain aims to extract the implicit and explicit aspects that may be specified in reviews. The model can be used for various purposes, such as:
1. Aspect extraction from review sentences and classification under 34 aspect categories.
2. Aspect-based restaurant recommendation systems.
3. Restaurant review analysis.
### Out-of-Scope Use
The model has been tuned to classify out-of-scope sentences into the General category.
## How to Get Started with the Model
Sample sentence: "The food was very delicious, elegant ambience and decoration, floors were clean and most importantly the food was affordable."

Expected output:
- Food-Taste
- Food-Price
- Restaurant-Decoration
- Restaurant-Atmosphere
- Restaurant-Hygiene
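
The card does not include inference code; the following is a minimal sketch under two assumptions: the checkpoint loads as a standard `transformers` sequence-classification model with one sigmoid output per aspect category, and `id2label` is populated in its config. The repository id and the 0.5 threshold are placeholders, not values confirmed by the card.

```python
# Minimal multi-label inference sketch (the repo id and threshold are placeholders).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_ID = "path/to/this-model"  # hypothetical: replace with the actual Hub repo id or a local path

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

sentence = ("The food was very delicious, elegant ambience and decoration, "
            "floors were clean and most importantly the food was affordable.")

inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Multi-label output: apply a sigmoid and keep every aspect category above the threshold.
probs = torch.sigmoid(logits)[0]
threshold = 0.5  # assumption; tune on validation data
predicted = [model.config.id2label[i] for i, p in enumerate(probs) if p > threshold]
print(predicted)
```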
## Training Details
A RoBERTa-Large variant is fine-tuned on 10,678 data entries. Each sentence is labeled with all of the aspect categories it may belong to (multi-label classification), and training runs until the validation loss has not improved for 3 consecutive epochs.
### Training Data
Reviews are tokenized into sentences, and 10,678 unique sentences are annotated for training.
The 34 aspect categories are grouped under 4 top-level categories (a label-encoding sketch follows the list):
- Restaurants (Restaurants and Ambience merged)
  - Atmosphere
  - Building
  - Location
  - Features
  - Hygiene
  - Kitchen
  - Recommendation
  - View
  - Decoration
  - Seating Plan
  - Options
  - Experience
  - General
- Service (Staff and Service merged)
  - Behavior
  - Wait Time
  - General
  - Experience
- Food (Food and Drinks merged)
  - Cuisine
  - Deals
  - Diet Options
  - Ingredients
  - Menu
  - Kitchen
  - Portion
  - Presentation
  - Price
  - Quality
  - Taste
  - Flavor
  - Recommendation
  - Experience
  - Dishes
  - General
- General (out-of-domain and contextless sentences)
  - General
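
The card does not describe the preprocessing pipeline; the sketch below is one plausible way (an assumption, using scikit-learn's `MultiLabelBinarizer`) to turn per-sentence aspect annotations into the 34-dimensional multi-hot label vectors a multi-label classifier expects. The example sentences and annotations are toy data.

```python
# Hypothetical sketch: encode per-sentence aspect annotations as multi-hot label vectors.
from sklearn.preprocessing import MultiLabelBinarizer

# Toy annotations in the CATEGORY-ASPECT format used by the evaluation report below.
sentences = [
    "The food was delicious and affordable.",
    "Staff were friendly but we waited an hour.",
]
annotations = [
    ["FOOD-TASTE", "FOOD-PRICE"],
    ["SERVICE-BEHAVIOUR", "SERVICE-WAIT_TIME"],
]

mlb = MultiLabelBinarizer()          # fit on the full 34-label set in practice
labels = mlb.fit_transform(annotations)
print(mlb.classes_)                  # label order used for the multi-hot vectors
print(labels)                        # one 0/1 row per sentence
```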
#### Training Hyperparameters
- learning rate (`lr`): 2e-5
- epsilon (`eps`): 1e-8
- batch size: 32
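
The exact training script is not included in the card; below is a minimal fine-tuning sketch, assuming a standard `transformers` `Trainer` with its default AdamW optimizer, the hyperparameters listed above, and early stopping after 3 epochs without validation-loss improvement. The two-sentence toy dataset and the 20-epoch cap are placeholders standing in for the real 10,678-sentence corpus and training budget.

```python
# Hypothetical fine-tuning sketch (assumptions: multi-label head with 34 sigmoid outputs,
# AdamW with the reported lr/eps, early stopping on validation loss with patience 3).
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorWithPadding, EarlyStoppingCallback,
                          Trainer, TrainingArguments)

NUM_LABELS = 34

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-large",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # BCE loss over per-label sigmoids
)

# Toy stand-in for the annotated corpus: each row is a sentence plus a 34-dim multi-hot vector.
toy = Dataset.from_dict({
    "text": ["The food was delicious and affordable.", "We waited an hour for a table."],
    "labels": [[1.0] + [0.0] * (NUM_LABELS - 1), [0.0, 1.0] + [0.0] * (NUM_LABELS - 2)],
})
toy = toy.map(lambda ex: tokenizer(ex["text"], truncation=True), remove_columns=["text"])

args = TrainingArguments(
    output_dir="absa-roberta-large",
    learning_rate=2e-5,              # lr=2e-5 from the card
    adam_epsilon=1e-8,               # eps=1e-8 from the card
    per_device_train_batch_size=32,  # batch_size=32 from the card
    num_train_epochs=20,             # placeholder upper bound; early stopping ends training sooner
    eval_strategy="epoch",           # `evaluation_strategy` on older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=toy,               # replace with the real train/validation splits
    eval_dataset=toy,
    data_collator=DataCollatorWithPadding(tokenizer),
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],
)
trainer.train()
```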
## Evaluation and Results
### Classification Report

| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| FOOD-CUISINE | 0.69 | 0.83 | 0.76 | 65 |
| FOOD-DEALS | 0.81 | 0.75 | 0.78 | 40 |
| FOOD-DIET_OPTION | 0.73 | 0.93 | 0.82 | 71 |
| FOOD-EXPERIENCE | 0.38 | 0.44 | 0.40 | 55 |
| FOOD-FLAVOR | 0.83 | 0.94 | 0.88 | 63 |
| FOOD-GENERAL | 0.65 | 0.78 | 0.71 | 141 |
| FOOD-INGREDIENT | 0.77 | 0.80 | 0.78 | 54 |
| FOOD-KITCHEN | 0.50 | 0.60 | 0.55 | 35 |
| FOOD-MEAL | 0.72 | 0.74 | 0.73 | 208 |
| FOOD-MENU | 0.80 | 0.89 | 0.84 | 136 |
| FOOD-PORTION | 0.90 | 0.91 | 0.90 | 76 |
| FOOD-PRESENTATION | 0.82 | 0.94 | 0.87 | 33 |
| FOOD-PRICE | 0.74 | 0.88 | 0.80 | 57 |
| FOOD-QUALITY | 0.61 | 0.66 | 0.63 | 102 |
| FOOD-RECOMMENDATION | 0.65 | 0.47 | 0.55 | 32 |
| FOOD-TASTE | 0.79 | 0.84 | 0.82 | 114 |
| GENERAL-GENERAL | 0.98 | 0.88 | 0.93 | 163 |
| RESTAURANT-ATMOSPHERE | 0.73 | 0.79 | 0.76 | 170 |
| RESTAURANT-BUILDING | 0.90 | 0.86 | 0.88 | 44 |
| RESTAURANT-DECORATION | 0.95 | 0.84 | 0.89 | 44 |
| RESTAURANT-EXPERIENCE | 0.67 | 0.60 | 0.63 | 189 |
| RESTAURANT-FEATURES | 0.55 | 0.76 | 0.64 | 75 |
| RESTAURANT-GENERAL | 0.45 | 0.49 | 0.47 | 47 |
| RESTAURANT-HYGIENE | 0.94 | 0.92 | 0.93 | 51 |
| RESTAURANT-KITCHEN | 0.82 | 0.85 | 0.84 | 33 |
| RESTAURANT-LOCATION | 0.59 | 0.78 | 0.67 | 69 |
| RESTAURANT-OPTIONS | 0.42 | 0.41 | 0.41 | 32 |
| RESTAURANT-RECOMMENDATION | 0.62 | 0.71 | 0.67 | 49 |
| RESTAURANT-SEATING_PLAN | 0.78 | 0.82 | 0.80 | 68 |
| RESTAURANT-VIEW | 0.80 | 0.88 | 0.84 | 42 |
| SERVICE-BEHAVIOUR | 0.65 | 0.87 | 0.74 | 127 |
| SERVICE-EXPERIENCE | 0.31 | 0.24 | 0.27 | 21 |
| SERVICE-GENERAL | 0.74 | 0.81 | 0.77 | 162 |
| SERVICE-WAIT_TIME | 0.86 | 0.85 | 0.86 | 94 |
| micro avg | 0.72 | 0.78 | 0.75 | 2762 |
| macro avg | 0.71 | 0.76 | 0.73 | 2762 |
| weighted avg | 0.73 | 0.78 | 0.75 | 2762 |
| samples avg | 0.75 | 0.78 | 0.75 | 2762 |

**Accuracy:** 0.9801993831240361
### Confusion Matrix

Per-label 2×2 confusion matrices, one for each of the 34 aspect categories in the order of the classification report above; in each matrix the first row is [true negatives, false positives] and the second row is [false negatives, true positives].

    [[[2047, 24], [11, 54]],
     [[2089, 7], [10, 30]],
     [[2041, 24], [5, 66]],
     [[2041, 40], [31, 24]],
     [[2061, 12], [4, 59]],
     [[1936, 59], [31, 110]],
     [[2069, 13], [11, 43]],
     [[2080, 21], [14, 21]],
     [[1869, 59], [55, 153]],
     [[1969, 31], [15, 121]],
     [[2052, 8], [7, 69]],
     [[2096, 7], [2, 31]],
     [[2061, 18], [7, 50]],
     [[1991, 43], [35, 67]],
     [[2096, 8], [17, 15]],
     [[1997, 25], [18, 96]],
     [[1970, 3], [19, 144]],
     [[1917, 49], [36, 134]],
     [[2088, 4], [6, 38]],
     [[2090, 2], [7, 37]],
     [[1890, 57], [75, 114]],
     [[2015, 46], [18, 57]],
     [[2061, 28], [24, 23]],
     [[2082, 3], [4, 47]],
     [[2097, 6], [5, 28]],
     [[2029, 38], [15, 54]],
     [[2086, 18], [19, 13]],
     [[2066, 21], [14, 35]],
     [[2052, 16], [12, 56]],
     [[2085, 9], [5, 37]],
     [[1950, 59], [17, 110]],
     [[2104, 11], [16, 5]],
     [[1927, 47], [30, 132]],
     [[2029, 13], [14, 80]]]
**Average validation loss:** 0.06330019191129883
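
The card does not state how these figures were produced; the sketch below shows one way (an assumption about the tooling) to generate a report, per-label confusion matrices, and an element-wise accuracy in this exact format with scikit-learn, using toy arrays in place of the real validation predictions.

```python
# Hypothetical evaluation sketch with toy data; y_true / y_pred would be the
# (n_sentences, 34) multi-hot ground-truth and prediction arrays in practice.
import numpy as np
from sklearn.metrics import accuracy_score, classification_report, multilabel_confusion_matrix

label_names = ["FOOD-CUISINE", "FOOD-DEALS", "FOOD-TASTE"]  # toy subset of the 34 categories
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0]])
y_pred = np.array([[1, 0, 1], [0, 1, 1], [1, 0, 0]])

# Per-label precision/recall/F1 plus micro/macro/weighted/samples averages.
print(classification_report(y_true, y_pred, target_names=label_names, zero_division=0))

# One [[TN, FP], [FN, TP]] matrix per label, as in the dump above.
print(multilabel_confusion_matrix(y_true, y_pred))

# Element-wise accuracy over all (sentence, label) pairs -- one plausible reading
# of the single accuracy figure reported above.
print(accuracy_score(y_true.ravel(), y_pred.ravel()))
```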
## Model Card Authors
Ali Haider
## Model Card Contact
+923068983139
alihaider.ah1510@gmail.com