DarwinAnim8or committed
Commit ffea759
1 Parent(s): 571e55d

Update README.md

Files changed (1):
  1. README.md +29 -9
README.md CHANGED
@@ -2,20 +2,40 @@
 tags:
 - autotrain
 - text-classification
+- emoji
+- sentiment
 language:
-- unk
+- en
 widget:
-- text: "I love AutoTrain"
-datasets:
-- DarwinAnim8or/autotrain-data-emoji-v3
+- text: I love apples
+- text: I hate apples
+- text: I hate it when they don't listen
+- text: I hate it when they don't listen :(
+- text: It's so cosy
+- text: there's nothing like nature
 co2_eq_emissions:
   emissions: 0.6833689692559574
+license: openrail
+datasets:
+- adorkin/extended_tweet_emojis
 ---
-
-# Model Trained Using AutoTrain
+# Emoji Suggester
+This model is a text classification model that suggests emojis for a given text. It uses the deberta-v3-base model as a backbone.
+
+## Training Data
+The dataset this model was trained on has had its emojis replaced with the Unicode characters rather than an index, which previously required a separate file to map the indices.
+The dataset was further modified in the following ways:
+* The "US" emoji was removed, as it serves very little purpose in general conversation.
+* The dataset was deduplicated.
+* Each emoji appears roughly as often as every other, preventing the model from becoming heavily biased toward the emojis that are most frequent in the training data.
+
+## Intended uses & limitations
+
+This model is intended for fun and entertainment purposes, such as adding emojis to social media posts, messages, or emails. It is not intended for serious or sensitive applications such as sentiment analysis, emotion recognition, or hate speech detection. The model may struggle with texts that are long, complex, or ambiguous, and may generate inappropriate or irrelevant emojis in some cases. It may also reflect biases and stereotypes present in the training data, such as gender, race, or culture. Users are advised to apply it with caution and discretion.
 
+## Model Training Info
 
 - Problem type: Multi-class Classification
-- Model ID: 83947142400
 - CO2 Emissions (in grams): 0.6834
 
 ## Validation Metrics
@@ -38,7 +58,7 @@ co2_eq_emissions:
 You can use cURL to access this model:
 
 ```
-$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "I love AutoTrain"}' https://api-inference.huggingface.co/models/DarwinAnim8or/autotrain-emoji-v3-83947142400
+$ curl -X POST -H "Authorization: Bearer YOUR_API_KEY" -H "Content-Type: application/json" -d '{"inputs": "I love apples"}' https://api-inference.huggingface.co/models/KoalaAI/Emoji-Suggester
 ```
 
 Or Python API:
@@ -46,9 +66,9 @@ Or Python API:
 ```
 from transformers import AutoModelForSequenceClassification, AutoTokenizer
 
-model = AutoModelForSequenceClassification.from_pretrained("DarwinAnim8or/autotrain-emoji-v3-83947142400", use_auth_token=True)
+model = AutoModelForSequenceClassification.from_pretrained("KoalaAI/Emoji-Suggester", use_auth_token=True)
 
-tokenizer = AutoTokenizer.from_pretrained("DarwinAnim8or/autotrain-emoji-v3-83947142400", use_auth_token=True)
+tokenizer = AutoTokenizer.from_pretrained("KoalaAI/Emoji-Suggester", use_auth_token=True)
 
 inputs = tokenizer("I love AutoTrain", return_tensors="pt")
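The Python snippet in the diff stops after tokenization. As a sketch of the remaining step — turning the classifier's logits into emoji suggestions — the following uses a made-up `id2label` table and stand-in logits; in actual use, the real mapping lives in the model's `config.json` and the real scores come from `model(**inputs).logits`:

```python
import math

# Hypothetical id2label mapping; the real table is in the model's config.json.
id2label = {0: "❤", 1: "😂", 2: "😭", 3: "🔥"}

# Stand-in class scores, shaped like one row of model(**inputs).logits.
logits = [1.2, 3.4, 0.5, 2.8]

# Softmax over the class scores to get probabilities.
exps = [math.exp(x) for x in logits]
total = sum(exps)
probs = [e / total for e in exps]

# The top-2 classes by probability become the suggested emojis.
top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
suggestions = [id2label[i] for i in top2]
print(suggestions)  # ['😂', '🔥']
```

Taking the top-k classes rather than a single argmax lets a UI offer several candidate emojis at once.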