Model

Base version of e5-multilingual fine-tuned on an annotated subset of mC4 (multilingual C4). This model provides generic embeddings for sentiment analysis. The embeddings can be used out of the box or fine-tuned on specific datasets.

Blog post: https://www.numind.ai/blog/creating-task-specific-foundation-models-with-gpt-4

Usage

The example below encodes a text and computes its embedding.

import torch
from transformers import AutoTokenizer, AutoModel

model = AutoModel.from_pretrained("Numind/e5-multilingual-sentiment_analysis")
tokenizer = AutoTokenizer.from_pretrained("Numind/e5-multilingual-sentiment_analysis")
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
model.to(device)

size = 256  # every input is padded or truncated to this many tokens
text = "This movie is amazing"

encoding = tokenizer(
    text,
    truncation=True,
    padding='max_length',
    max_length=size,
)

# Add a batch dimension: input_ids has shape (1, size)
input_ids = torch.tensor(encoding.input_ids).unsqueeze(0).to(device)

# Take the last hidden layer, shape (1, size, hidden_dim); no gradients are needed
with torch.no_grad():
    emb = model(input_ids, output_hidden_states=True).hidden_states[-1].cpu()

# Average over the token dimension to get one vector for the text
embText = torch.mean(emb, dim=1)
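
One way to use the embeddings out of the box is to train a small classifier on top of them while the model itself stays frozen. The following is a minimal sketch under stated assumptions, not the authors' training procedure: `train_linear_probe` is a hypothetical helper, `hidden_dim = 16` is an arbitrary illustrative size, and the random vectors are placeholders standing in for embeddings such as `embText` stacked over a labeled dataset.

```python
import torch
from torch import nn


def train_linear_probe(embeddings, labels, num_classes, epochs=200, lr=0.1):
    """Fit a linear classifier on fixed (frozen) embeddings.

    Hypothetical helper for illustration; not part of the model repository.
    """
    probe = nn.Linear(embeddings.shape[1], num_classes)
    optimizer = torch.optim.Adam(probe.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = loss_fn(probe(embeddings), labels)
        loss.backward()
        optimizer.step()
    return probe


# Placeholder data (assumption): random vectors standing in for real embeddings.
# With real data, stack one embText row per text and use embText.shape[1].
torch.manual_seed(0)
hidden_dim = 16
negative = torch.randn(20, hidden_dim) - 2.0
positive = torch.randn(20, hidden_dim) + 2.0
embeddings = torch.cat([negative, positive])
labels = torch.cat([
    torch.zeros(20, dtype=torch.long),  # class 0: negative
    torch.ones(20, dtype=torch.long),   # class 1: positive
])

probe = train_linear_probe(embeddings, labels, num_classes=2)

# Predict a class for each embedding and measure training accuracy
with torch.no_grad():
    predictions = probe(embeddings).argmax(dim=1)
accuracy = (predictions == labels).float().mean().item()
```

Because only the linear layer is trained, this runs quickly on a CPU. Fine-tuning the full model on a specific dataset would instead require backpropagating through the encoder, which this sketch does not cover.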

Model size: 278M params (Safetensors, tensor type F32)