metadata
license: mit
language:
- en
pipeline_tag: text-classification
tags:
- sentiment-analysis
- text-classification
- generic
- sentiment-classification
Model
Base version of e5-v2 fine-tuned on an annotated subset of C4 (Numind/C4_sentiment-analysis). This model provides generic embeddings for sentiment analysis. The embeddings can be used out of the box or fine-tuned on specific datasets.
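As a rough illustration of the "out of the box" use, one option is to keep the encoder frozen and train a small linear classifier on top of its embeddings. The sketch below is only an assumption about how that could look, not the procedure used for this model: the embeddings are random placeholder vectors, the labels are hypothetical, and the hidden size of 768 is assumed for a base-sized encoder.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder inputs: random vectors stand in for real sentence embeddings.
# 768 is the hidden size assumed here for a base-sized encoder.
dim, n = 768, 40
labels = torch.tensor([0, 1] * (n // 2))  # hypothetical negative/positive labels
# Shift class 1 so that the toy data is easy to separate
embeddings = torch.randn(n, dim) + labels.unsqueeze(1).float()

# Linear probe: the encoder stays frozen, only this layer is trained
probe = nn.Linear(dim, 2)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

losses = []
for _ in range(100):
    optimizer.zero_grad()
    loss = loss_fn(probe(embeddings), labels)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Predicted class index for each embedding
predictions = probe(embeddings).argmax(dim=1)
```

With real data, `embeddings` would be replaced by vectors computed with this model and `labels` by the annotations of the target dataset.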
Usage
Below is an example that encodes a text and computes its embedding.
import torch
from transformers import AutoTokenizer, AutoModel

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModel.from_pretrained("Numind/e5-base-SA")
tokenizer = AutoTokenizer.from_pretrained("Numind/e5-base-SA")
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

size = 256  # every input is padded or truncated to this many tokens
text = "This movie is amazing"

encoding = tokenizer(
    text,
    truncation=True,
    padding='max_length',
    max_length=size,
)

# Add a batch dimension, run the model, and keep the last hidden layer
input_ids = torch.tensor(encoding.input_ids).unsqueeze(0).to(device)
emb = model(input_ids, output_hidden_states=True).hidden_states[-1].cpu().detach()

# Average over the token dimension (padding positions included): one vector per text
embText = torch.mean(emb, dim=1)
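The example above averages the hidden states over all 256 positions, padding included. A common variant averages only over real tokens by using the tokenizer's attention mask. The sketch below shows that variant on hand-made tensors. The helper name `masked_mean_pool` is hypothetical, and this is not necessarily the pooling the model was fine-tuned with, so the resulting vectors will differ from those of the example above.

```python
import torch

def masked_mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors over real tokens only, skipping padding.

    hidden: (batch, seq_len, dim) last-layer hidden states
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(hidden.dtype)  # (batch, seq_len, 1)
    summed = (hidden * mask).sum(dim=1)                   # padded positions contribute 0
    counts = mask.sum(dim=1).clamp(min=1.0)               # avoid division by zero
    return summed / counts

# Toy check with hand-made tensors (no model download needed)
hidden = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
attention_mask = torch.tensor([[1, 1, 0]])
pooled = masked_mean_pool(hidden, attention_mask)
print(pooled)  # tensor([[2., 3.]])
```

With the variables from the example above, the call would be `masked_mean_pool(emb, torch.tensor(encoding.attention_mask).unsqueeze(0))`.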