abullard1 committed on
Commit 6bfe21c · verified · 1 Parent(s): 2669868

Update README.md

  1. README.md +37 -7
README.md CHANGED
@@ -18,6 +18,9 @@ tags:
18
  - sentiment-analysis
19
  - text-classification
20
  - fine-tuned
 
 
 
21
  thumbnail: https://i.ibb.co/Bnj0gw6/abullard1-steam-review-constructiveness-classifier-logo.png
22
 
23
  model-index:
@@ -42,14 +45,34 @@ model-index:
42
  type: f1
43
  value: 0.794
44
  ---
 
 
 
 
 
45
 
46
- # Fine-tuned ALBERT Model for Constructiveness Detection in Steam Reviews
 
 
 
 
 
 
 
47
 
48
  ## Model Summary
49
 
50
- This model is a fine-tuned version of **albert-base-v2**, designed to classify whether Steam game reviews are constructive or non-constructive. The model was trained on the [1.5K Steam Reviews Binary Labeled for Constructiveness dataset](https://huggingface.co/datasets/abullard1/steam-reviews-constructiveness-binary-label-annotations-1.5k), which consists of user-generated game reviews (along other features) labeled with binary labels (`1 for constructive` or `0 for non-constructive`).
 
 
51
 
52
- The datasets featues were concatenated into Strings with the following format: "Review: **{review}**, Playtime: **{author_playtime_at_review}**, Voted Up: **{voted_up}**, Upvotes: **{votes_up}**, Votes Funny: **{votes_funny}**" and then fed to the model accompanied by the respective ***constructive*** labels. This approach of concatenating the features into a simple String offers a good trade-off between complexity and performance, compared to other options.
 
 
 
 
 
 
53
 
54
  ### Intended Use
55
 
@@ -57,7 +80,8 @@ The model can be applied in any scenario where it's important to distinguish bet
57
 
58
  ### Limitations
59
 
60
- The model may be less effective in domains outside of gaming, as it was trained specifically on Steam reviews. Additionally, a slightly **imbalanced dataset** was used for training (approximately 63% non-constructive, 37% constructive).
 
61
 
62
  ## Evaluation Results
63
 
@@ -70,13 +94,15 @@ The model was trained and evaluated using an 80/10/10 Train/Dev/Test split, achi
70
 
71
  These results indicate that the model performs reasonably well at identifying the correct label. (~80%)
72
 
 
 
73
  ## How to Use
74
 
75
  ### Via the Huggingface Space
76
- The easiest way to test and try out the model is via its' [Huggingface Space](https://huggingface.co/spaces/abullard1/steam-review-constructiveness-classifier).
77
 
78
  ### Via the HF Transformers Library
79
- You can also use this model through the Hugging Face transformers `pipeline` API for easy classification. Here's how to do it in Python:
80
 
81
  ```python
82
  from transformers import pipeline
@@ -97,4 +123,8 @@ classifier = pipeline(
97
  top_k=None,
98
  truncation=True,
99
  max_length=512,
100
- torch_dtype=torch_d_type)
 
 
 
 
 
18
  - sentiment-analysis
19
  - text-classification
20
  - fine-tuned
21
+ developers:
22
+ - Samuel Ruairí Bullard
23
+ - Marco Schreiner
24
  thumbnail: https://i.ibb.co/Bnj0gw6/abullard1-steam-review-constructiveness-classifier-logo.png
25
 
26
  model-index:
 
45
  type: f1
46
  value: 0.794
47
  ---
48
+ <br>
49
+ <br>
50
+ <div style="text-align: center;">
51
+ <img src="https://i.ibb.co/Bnj0gw6/abullard1-steam-review-constructiveness-classifier-logo.png" style="max-width: 30%; display: block; margin: 0 auto;">
52
+ </div>
53
 
54
+ <br>
55
+ <br>
56
+ <br>
57
+
58
+ <div style="text-align: center;">
59
+ <h1>Fine-tuned ALBERT Model for Constructiveness Detection in Steam Reviews</h1>
60
+ </div>
61
+ <hr>
62
 
63
  ## Model Summary
64
 
65
+ This model is a fine-tuned version of **albert-base-v2**, designed to classify whether Steam game reviews are constructive or non-constructive. It leverages the [1.5K Steam Reviews Binary Labeled for Constructiveness dataset](https://huggingface.co/datasets/abullard1/steam-reviews-constructiveness-binary-label-annotations-1.5k), containing user-generated game reviews labeled as either:
66
+ - **1 (constructive)**
67
+ - **0 (non-constructive)**
68
 
69
+ The dataset features were combined into a single string per review, formatted as follows:
70
+ <br>
71
+ <br>
72
+ "Review: **{review}**, Playtime: **{author_playtime_at_review}**, Voted Up: **{voted_up}**, Upvotes: **{votes_up}**, Votes Funny: **{votes_funny}**" and then fed to the model accompanied by the respective ***constructive*** labels.
73
+ <br>
74
+ <br>
75
+ This approach of concatenating the features into a simple string offers a good trade-off between complexity and performance compared to other options.
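
As a minimal sketch of the input format described above, the features of a single review could be joined as follows. The function name `build_model_input` is hypothetical and not part of the model repository; the parameter names simply mirror the placeholders in the format string, and the Python types are assumptions.

```python
def build_model_input(
    review: str,
    author_playtime_at_review: int,
    voted_up: bool,
    votes_up: int,
    votes_funny: int,
) -> str:
    """Join the review features into the single-string format from the model card.

    Hypothetical helper: the field order and labels follow the documented
    format string; the argument types are assumptions for illustration.
    """
    return (
        f"Review: {review}, "
        f"Playtime: {author_playtime_at_review}, "
        f"Voted Up: {voted_up}, "
        f"Upvotes: {votes_up}, "
        f"Votes Funny: {votes_funny}"
    )


# Rebuild the example input used in the README's usage snippet.
example = build_model_input(
    review="I think this is a great game but it still has some room for improvement.",
    author_playtime_at_review=12,
    voted_up=True,
    votes_up=1,
    votes_funny=0,
)
print(example)
```

The resulting string can be passed to the `classifier` pipeline in place of a hand-written input, which keeps inference inputs consistent with the format used during training.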
76
 
77
  ### Intended Use
78
 
 
80
 
81
  ### Limitations
82
 
83
+ - **Domain Specificity**: The model was trained on Steam reviews and may not generalize well outside gaming.
84
+ - **Dataset Imbalance**: The training data has an approximate 63%-37% split between non-constructive and constructive reviews.
85
 
86
  ## Evaluation Results
87
 
 
94
 
95
  These results indicate that the model performs reasonably well at identifying the correct label. (~80%)
96
 
97
+ <hr>
98
+
99
  ## How to Use
100
 
101
  ### Via the Huggingface Space
102
+ Explore and test the model interactively on its [Hugging Face Space](https://huggingface.co/spaces/abullard1/steam-review-constructiveness-classifier).
103
 
104
  ### Via the HF Transformers Library
105
+ To use the model programmatically, use this Python snippet:
106
 
107
  ```python
108
  from transformers import pipeline
 
123
  top_k=None,
124
  truncation=True,
125
  max_length=512,
126
+ torch_dtype=torch_d_type)
127
+
128
+ review = "Review: I think this is a great game but it still has some room for improvement., Playtime: 12, Voted Up: True, Upvotes: 1, Votes Funny: 0"
129
+ result = classifier(review)
130
+ print(result)