Patristic Latin Sentence Embeddings
This is a Latin sentence-transformers model fine-tuned from bowphs/LaBerta. It maps Latin sentences and paragraphs to a dense vector space and can be used to detect semantic correspondences or for information retrieval (e.g. detecting quotations and allusions).
Model Details
Model Description
- Model Type: Sentence Transformer
- Base Model: bowphs/LaBerta
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: cosine
- Pooling Mode: cls
- Supported Modality: Text
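The pooling mode and similarity function listed above can be illustrated with a small NumPy sketch. The token matrix is made-up toy data, and `cls_pool` and `cosine` are hypothetical helper names rather than part of sentence-transformers. With the real model, each token vector would be a 768-dimensional output of the RoBERTa encoder, whose first token (`<s>`) plays the role of CLS.

```python
import numpy as np

def cls_pool(token_embeddings: np.ndarray) -> np.ndarray:
    """CLS pooling: keep only the first token's vector of each sequence.

    token_embeddings has shape (batch, seq_len, hidden).
    """
    return token_embeddings[:, 0, :]

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy batch: 2 sequences, 3 tokens each, hidden size 4 (assumed for illustration).
tokens = np.array([
    [[1.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0], [5.0, 5.0, 5.0, 5.0]],
    [[2.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]],
])

sentence_vectors = cls_pool(tokens)  # later tokens are ignored entirely
print(sentence_vectors.shape)
print(cosine(sentence_vectors[0], sentence_vectors[1]))
```

Because cosine similarity ignores vector length, the two toy sentence vectors score as identical even though one is twice as long as the other.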
Model Sources
- Documentation: https://sbert.net
- Repository: https://github.com/huggingface/sentence-transformers
- Hugging Face: https://huggingface.co/models?library=sentence-transformers
Full Model Architecture
SentenceTransformer(
  (0): Transformer(transformer_task=feature-extraction, architecture=RobertaModel)
  (1): Pooling(embedding_dimension=768, pooling_mode=cls)
)
Usage
Direct Usage (Sentence Transformers)
pip install -U sentence-transformers
from sentence_transformers import SentenceTransformer

# Download the model from the Hugging Face Hub
model = SentenceTransformer("TdelaSelle/PatriLaSE")

# Two Latin sentences with closely related wording
sentences = [
    "quis ergo sanat omnes languores tuos nisi qui propitius fit omnibus iniquitatibus tuis?",
    "qui propitiatur omnibus iniquitatibus tuis qui sanat omnes infirmitates tuas",
]
embeddings = model.encode(sentences)
print(embeddings.shape)  # (2, 768)
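For quotation or allusion detection, the embeddings are typically compared against a corpus and ranked by cosine similarity. The following is a minimal sketch of that ranking step, using toy 3-dimensional vectors in place of real `model.encode` output. `top_k_matches` is a hypothetical helper written for this example; sentence-transformers also offers `model.similarity` for the scoring part.

```python
import numpy as np

def top_k_matches(query_vec: np.ndarray, corpus_vecs: np.ndarray, k: int = 2):
    """Return (corpus_index, cosine_score) pairs for the k closest corpus vectors."""
    # L2-normalise so that a dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    c = corpus_vecs / np.linalg.norm(corpus_vecs, axis=1, keepdims=True)
    scores = c @ q
    order = np.argsort(-scores)[:k]  # highest scores first
    return [(int(i), float(scores[i])) for i in order]

# Toy stand-ins for embeddings (assumed values, not model output).
corpus = np.array([
    [1.0, 0.0, 0.0],   # same direction as the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.9, 0.1, 0.0],   # close to the query
])
query = np.array([1.0, 0.0, 0.0])

matches = top_k_matches(query, corpus, k=2)
for index, score in matches:
    print(index, round(score, 4))
```

With real data, `corpus` would hold the encoded source texts and `query` the encoded passage under investigation. Any score threshold for calling a match a quotation would need to be tuned on annotated examples.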
Training Details
Training Dataset
- Size: 183227 training samples
- Corpus Path: Patristic Latin sentences
- Loss: MaskedDenoisingAutoEncoderLoss
Training Hyperparameters
- epochs: 3
- batch_size: 16
- learning_rate: 2e-05
- weight_decay: 0.0
- warmup_ratio: 0.0
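As a rough check on the scale of this run, the number of optimizer steps can be derived from the figures above. This sketch assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these is stated in the card.

```python
import math

dataset_size = 183_227  # training samples, from the card
batch_size = 16
epochs = 3
warmup_ratio = 0.0

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(dataset_size / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = int(warmup_ratio * total_steps)

print(steps_per_epoch, total_steps, warmup_steps)  # 11452 34356 0
```

A warmup ratio of 0.0 means the learning rate starts at its full value of 2e-05 from the first step.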
Framework Versions
- pytorch: 2.11.0+cu130
- sentence_transformers: 5.4.1
- transformers: 4.57.6