A manually pruned version of distilbert-portuguese-cased (roughly 4.43M parameters), fine-tuned to produce high-quality embeddings in a lightweight form factor.

Model Trained Using AutoTrain

  • Problem type: Sentence Transformers

Validation Metrics

  • loss: 0.3181200921535492

  • cosine_accuracy: 0.8921948650328134
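
If this metric comes from a triplet-style evaluation (an assumption here, since the exact AutoTrain evaluation setup is not shown), cosine_accuracy is the fraction of (anchor, positive, negative) examples for which the anchor embedding is closer to the positive than to the negative under cosine similarity. A minimal sketch of that computation, using made-up example triplets rather than the actual validation data:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/micro-bertim-embeddings")

# Illustrative triplets only; the real validation set is not reproduced here
triplets = [
    ("O gato dorme no sofá",
     "Um felino descansa no estofado",
     "A bolsa de valores caiu hoje"),
]

correct = 0
for anchor, positive, negative in triplets:
    emb = model.encode([anchor, positive, negative])
    # Cosine similarities of the anchor against the positive and the negative
    sims = model.similarity(emb[0:1], emb[1:3])[0]
    correct += int(sims[0] > sims[1])

print(f"cosine_accuracy: {correct / len(triplets):.4f}")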

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference:

from sentence_transformers import SentenceTransformer

# Download from the Hugging Face Hub
model = SentenceTransformer("cnmoro/micro-bertim-embeddings")
# Run inference
sentences = [
    'O pôr do sol pinta o céu com tons de laranja e vermelho',
    'Joana adora estudar matemática nas tardes de sábado',
    'Os pássaros voam em formação, criando um espetáculo no horizonte',
]
embeddings = model.encode(sentences)
print(embeddings.shape)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
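
Beyond the pairwise similarity matrix above, the same embeddings can drive a small semantic-search style lookup. The sketch below is illustrative only: the query sentence and the ranking loop are assumptions added for this example, not part of the original card.

from sentence_transformers import SentenceTransformer

model = SentenceTransformer("cnmoro/micro-bertim-embeddings")

corpus = [
    'O pôr do sol pinta o céu com tons de laranja e vermelho',
    'Joana adora estudar matemática nas tardes de sábado',
    'Os pássaros voam em formação, criando um espetáculo no horizonte',
]
query = 'Um belo entardecer colorido no céu'  # made-up query for illustration

corpus_embeddings = model.encode(corpus)
query_embedding = model.encode([query])

# similarity() returns a (1, len(corpus)) tensor of cosine similarities
scores = model.similarity(query_embedding, corpus_embeddings)[0]

# Rank the corpus from most to least similar to the query
for idx in scores.argsort(descending=True).tolist():
    print(f"{scores[idx].item():.4f}  {corpus[idx]}")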