Update README.md (#1)

- Update README.md (5e2674fae44fcb716483b2f00456a3b567285021)

README.md +101 -1

@@ -10,6 +10,106 @@ language:
 - en
 metrics:
 - accuracy
 library_name: transformers
 pipeline_tag: text-classification
----

 - en
 metrics:
 - accuracy
+- sparse_val accuracy
+- sparse_val categorical accuracy
 library_name: transformers
 pipeline_tag: text-classification
+tags:
+- textclassification
+- roberta
+- robertabase
+- sentimentanalysis
+- nlp
+- tweetanalysis
+- tweet
+- analysis
+- sentiment
+- positive
+- newsanalysis
+---
+---
+  <b>BYRD'S I - ROBERTA BASED TWEET/REVIEW/TEXT ANALYSIS</b>
+---
+This is a ro<b>BERT</b>a-base model fine-tuned on 8 datasets with ~20 M tweets. The model is suited to English, but it can also do a fine job on other languages.
+<b>Git Repo:</b><a href = "https://github.com/Caffeine-Coders/Sentiment-Analysis-Project"> SENTIMENTANALYSIS-PROJECT</a>
+<b>Demo:</b><a href = "https://byrdi.netlify.app/"> BYRD'S I</a>
+<b>labels: </b>
+ 0 -> Negative;
+ 1 -> Neutral;
+ 2 -> Positive;
+<b>Model Metrics</b><br/>
+<b>Accuracy: </b> ~96% <br/>
+<b>Sparse Categorical Accuracy: </b> 0.9597 <br/>
+<b>Loss: </b> 0.1144 <br/>
+<b>val_loss -- [onLast_train] : </b> 0.1482 <br/>
+<b>Note: </b>
+Because of discrepancies in the Neutral data across datasets, we published another model, <a href = "https://huggingface.co/AK776161/birdseye_roberta-base-18">
+Byrd's I only positive_negative model</a>, to help isolate neutral data. The two models are combined with the
+<b>AdaBoot</b> method (averaging their logits, as in the example below) to get more accurate output.
+# Example of Classification:
+```python
+import numpy as np
+import tensorflow  # TensorFlow must be installed because the checkpoints are loaded with from_tf=True
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+# model 0: the positive/negative model
+tokenizer = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-18", use_fast=True)
+model = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-18", from_tf=True)
+# model 1 (tokenizer1 is loaded here, but the example below tokenizes with `tokenizer` only)
+tokenizer1 = AutoTokenizer.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", use_fast=True)
+model1 = AutoModelForSequenceClassification.from_pretrained("AK776161/birdseye_roberta-base-tweet-eval", from_tf=True)
+#-----------------------Adaboot technique---------------------------
+def nparraymeancalc(arr1, arr2):
+  # Average the two models' logits row by row.
+  returner = []
+  for i in range(len(arr1)):
+    # If model 0's logit for label 1 (Neutral) is below -7, reset it to 0 before averaging.
+    if arr1[i][1] < -7:
+      arr1[i][1] = 0
+    returner.append(np.mean([arr1[i], arr2[i]], axis=0))
+  return np.array(returner)
+def predictions(tokenizedtext):
+  # Run both models on the same tokenized batch and convert their logits to NumPy arrays.
+  logits1 = model(**tokenizedtext).logits.detach().numpy()
+  logits2 = model1(**tokenizedtext).logits.detach().numpy()
+  return nparraymeancalc(logits1, logits2)
+def labelassign(predictionresult):
+  # Pick the highest-scoring label id (0, 1 or 2) for each input.
+  return [int(row.argmax()) for row in predictionresult]
+tokenizeddata = tokenizer("----YOUR_TEXT---", return_tensors="pt", padding=True, truncation=True)
+result = predictions(tokenizeddata)
+print(labelassign(result))  # prints a list of label ids, one per input text
+```
+Output for "I LOVE YOU":
+```
+1) Positive: 0.994
+2) Negative: 0.000
+3) Neutral: 0.006
+```
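
The classification example prints label ids, while the output above lists a score per label. One plausible way to get scores of that shape is a softmax over the averaged logits. The sketch below is only an illustration under that assumption: `softmax`, `scores_by_label`, `ID2LABEL`, and the sample logits are hypothetical names and made-up numbers, not part of the model card or real model output.

```python
import math

# Label ids as listed in the model card.
ID2LABEL = {0: "Negative", 1: "Neutral", 2: "Positive"}


def softmax(logits):
    """Numerically stable softmax over a sequence of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scores_by_label(logits_row):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(list(logits_row))
    pairs = [(ID2LABEL[i], p) for i, p in enumerate(probs)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Hypothetical averaged logits for one input (made-up numbers).
example_logits = [-3.0, 0.5, 4.0]
for rank, (label, prob) in enumerate(scores_by_label(example_logits), start=1):
    print(f"{rank}) {label}: {prob:.3f}")
```

With the example above, each row of `result` could be passed to `scores_by_label` in place of `example_logits`.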