cnmoro/bert-tiny-embeddings-english-portuguese

Model Trained Using AutoTrain

Problem type: Sentence Transformers

Validation Metrics

loss: 0.056979671120643616

Info

This is the bert-tiny model fine-tuned on 15B tokens for embedding/feature extraction in English and Brazilian Portuguese.

The output vector size is 128.

This model has only 4.4M parameters, but after fine-tuning the quality of its embeddings is well above what its size would suggest.

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the Hugging Face Hub
model = SentenceTransformer("cnmoro/bert-tiny-embeddings-english-portuguese")
# Run inference
sentences = [
    'first passage',
    'second passage'
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 128): two sentences, one 128-dimensional vector each
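
Since the model is meant for sentence similarity, a common next step is to compare the resulting vectors with cosine similarity. The following is a minimal sketch of that step, not part of the original card: `cosine_similarity_matrix` is a hypothetical helper name, and the random 128-dimensional vectors are stand-ins for real model output so the snippet runs without downloading anything.

```python
import numpy as np


def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Return the pairwise cosine similarity between the rows of `embeddings`."""
    # Scale each row to unit length; the clip guards against division by zero.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    normalized = embeddings / np.clip(norms, 1e-12, None)
    # The dot product of unit vectors is their cosine similarity.
    return normalized @ normalized.T


# Stand-in data (assumption): random vectors shaped like the model output,
# i.e. 2 sentences x 128 dimensions. Replace with model.encode(sentences).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(2, 128)).astype(np.float32)

similarities = cosine_similarity_matrix(embeddings)
print(similarities.shape)
```

To use it with the model, pass the array returned by `model.encode(sentences)` in place of the random stand-in. The Sentence Transformers library also ships `sentence_transformers.util.cos_sim`, which should serve the same purpose if you prefer not to write the helper yourself.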

Model size: 4.39M params (Safetensors)

Tensor type: F32

Base model: google/bert_uncased_L-2_H-128_A-2