How to use WealthFromAI/empire-embed with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("WealthFromAI/empire-embed")
sentences = [
"brain scan",
"Full WordPress Site Health Audit. Check plugins — installed, active, and update status command Verify SEO configuration command Test page speed with Lighthouse command Security check — WordFence scan and login protection check Test REST API connectivity and credentials command Check Google Search Console for crawl errors and ranking prompt. Tags: wordpress, audit, seo, security, performance, plugins, health",
"Bulk Article Creation for a Site. Research topics and identify content gaps command Generate content outlines and briefs command Write content using ZimmWriter or scripts prompt Generate featured images for all articles command Publish articles and set featured images command. Tags: content, articles, bulk, wordpress, seo, publishing, zimmwriter",
"Run EMPIRE-BRAIN Scan and Intelligence Cycle. Run full empire brain scan command Generate intelligence briefing command Check brain stats and performance metrics command Run evolution cycle command Verify Sentinel service monitoring status command. Tags: empire, brain, scan, intelligence, monitoring, evolution, briefing"
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model fine-tuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
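The three modules listed above (Transformer, mean Pooling, Normalize) can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the 2-dimensional toy token vectors and the helper names `mean_pool` and `l2_normalize` are assumptions made for illustration, while the real model works on 384-dimensional token embeddings.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors, skipping padded positions (mask == 0)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale the vector to unit length, as the Normalize() module does."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


# Hypothetical token embeddings; the last position is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)      # mean of the two unmasked vectors
embedding = l2_normalize(pooled)      # unit-length sentence embedding
print(pooled, embedding)
```

Because the final embedding has unit length, the dot product of two embeddings equals their cosine similarity.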
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("WealthFromAI/empire-embed")
# Run inference
sentences = [
'batch articles',
'Bulk Article Creation for a Site. Research topics and identify content gaps command Generate content outlines and briefs command Write content using ZimmWriter or scripts prompt Generate featured images for all articles command Publish articles and set featured images command. Tags: content, articles, bulk, wordpress, seo, publishing, zimmwriter',
'Site Speed & Core Web Vitals Optimization. Run Lighthouse audit command Configure LiteSpeed Cache (all sites use it) command Image optimization check Font and render-blocking resource optimization check Verify improvements and monitor command. Tags: seo, speed, performance, core-web-vitals, litespeed, images, caching',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.7041, 0.2289],
# [0.7041, 1.0000, 0.3787],
# [0.2289, 0.3787, 1.0000]])
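One way to read the similarity matrix is as a ranking of candidates for a query. The sketch below assumes the first sentence ("batch articles") is the query and reuses the scores printed above; the helper `rank_matches` and the shortened `labels` are hypothetical names added for illustration.

```python
# Similarity scores copied from the printed tensor above.
similarities = [
    [1.0000, 0.7041, 0.2289],
    [0.7041, 1.0000, 0.3787],
    [0.2289, 0.3787, 1.0000],
]
# Shortened titles of the three input sentences (for display only).
labels = [
    "batch articles",
    "Bulk Article Creation for a Site",
    "Site Speed & Core Web Vitals Optimization",
]


def rank_matches(matrix, query_index):
    """Return (index, score) pairs for all other rows, best match first."""
    row = matrix[query_index]
    candidates = [(i, s) for i, s in enumerate(row) if i != query_index]
    return sorted(candidates, key=lambda pair: pair[1], reverse=True)


ranked = rank_matches(similarities, 0)
for index, score in ranked:
    print(f"{score:.4f}  {labels[index]}")
```

With these scores the short query lands closest to the bulk article workflow (0.7041) and well away from the site speed workflow (0.2289).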
Dataset: empire-eval. Evaluated with EmbeddingSimilarityEvaluator.

| Metric | Value |
|---|---|
| pearson_cosine | 0.5531 |
| spearman_cosine | 0.532 |
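The pearson_cosine and spearman_cosine values above are correlations between the model's cosine similarities and the gold labels. A rough pure-Python sketch of both statistics follows; the `predicted` and `gold` lists are hypothetical toy data, not the empire-eval set, and the helper names are assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical cosine similarities and gold labels for four pairs.
predicted = [0.70, 0.23, 0.38, 0.91]
gold = [1.0, 0.0, 0.0, 1.0]
print(round(pearson(predicted, gold), 4), round(spearman(predicted, gold), 4))
```

Spearman only looks at rank order, so it is less sensitive to the exact scale of the similarity scores than Pearson.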
Columns: sentence_0, sentence_1, and label

| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |

Samples:

| sentence_0 | sentence_1 | label |
|---|---|---|
| bootstrap project | Bootstrap a New Empire Project. Create project directory and initialize git command Create PROJECT_DNA.md command Create CLAUDE.md with project-specific config prompt Set up Python environment command Create initial git commit and push to GitHub command Register project in EMPIRE-BRAIN command. Tags: empire, project, bootstrap, setup, new | 1.0 |
| update container | WordPress Site SEO Setup & Configuration. Verify RankMath SEO plugin is installed and activated check Configure RankMath general settings prompt Set up Schema markup patterns per post type prompt Configure robots.txt and sitemap command Set up internal linking structure prompt Configure affiliate link handling prompt. Tags: wordpress, seo, rankmath, schema, setup | 0.0 |
| restart service | Full WordPress Site Health Audit. Check plugins — installed, active, and update status command Verify SEO configuration command Test page speed with Lighthouse command Security check — WordFence scan and login protection check Test REST API connectivity and credentials command Check Google Search Console for crawl errors and ranking prompt. Tags: wordpress, audit, seo, security, performance, plugins, health | 0.0 |

CosineSimilarityLoss with these parameters:
{
"loss_fct": "torch.nn.modules.loss.MSELoss"
}
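As a rough illustration of this objective, the sketch below computes the mean squared error between the cosine similarity of each embedding pair and its label. The 2-dimensional vectors, the labels, and the helper names are hypothetical; the real loss operates on batches of 384-dimensional embeddings through torch.nn.MSELoss.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_mse_loss(pairs, labels):
    """Mean squared error between pairwise cosine similarity and labels."""
    errors = [
        (cosine_similarity(a, b) - label) ** 2
        for (a, b), label in zip(pairs, labels)
    ]
    return sum(errors) / len(errors)


# Hypothetical (sentence_0, sentence_1) embedding pairs with float labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.0]),  # identical direction, label 1.0: no error
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal, label 0.0: no error
    ([1.0, 0.0], [1.0, 1.0]),  # 45 degrees apart, label 0.0: penalized
]
labels = [1.0, 0.0, 0.0]
loss = cosine_mse_loss(pairs, labels)
print(round(loss, 4))
```

Only the third pair contributes here, which shows how pairs labeled 0.0 are pushed apart while pairs labeled 1.0 are pulled together.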
Non-default hyperparameters:

per_device_train_batch_size: 16
num_train_epochs: 5
eval_strategy: steps
per_device_eval_batch_size: 16
multi_dataset_batch_sampler: round_robin

All hyperparameters:

per_device_train_batch_size: 16
num_train_epochs: 5
max_steps: -1
learning_rate: 5e-05
lr_scheduler_type: linear
lr_scheduler_kwargs: None
warmup_steps: 0
optim: adamw_torch_fused
optim_args: None
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
optim_target_modules: None
gradient_accumulation_steps: 1
average_tokens_across_devices: True
max_grad_norm: 1
label_smoothing_factor: 0.0
bf16: False
fp16: False
bf16_full_eval: False
fp16_full_eval: False
tf32: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
use_liger_kernel: False
liger_kernel_config: None
use_cache: False
neftune_noise_alpha: None
torch_empty_cache_steps: None
auto_find_batch_size: False
log_on_each_node: True
logging_nan_inf_filter: True
include_num_input_tokens_seen: no
log_level: passive
log_level_replica: warning
disable_tqdm: False
project: huggingface
trackio_space_id: trackio
eval_strategy: steps
per_device_eval_batch_size: 16
prediction_loss_only: True
eval_on_start: False
eval_do_concat_batches: True
eval_use_gather_object: False
eval_accumulation_steps: None
include_for_metrics: []
batch_eval_metrics: False
save_only_model: False
save_on_each_node: False
enable_jit_checkpoint: False
push_to_hub: False
hub_private_repo: None
hub_model_id: None
hub_strategy: every_save
hub_always_push: False
hub_revision: None
load_best_model_at_end: False
ignore_data_skip: False
restore_callback_states_from_checkpoint: False
full_determinism: False
seed: 42
data_seed: None
use_cpu: False
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_pin_memory: True
dataloader_persistent_workers: False
dataloader_prefetch_factor: None
remove_unused_columns: True
label_names: None
train_sampling_strategy: random
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
ddp_backend: None
ddp_timeout: 1800
fsdp: []
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
deepspeed: None
debug: []
skip_memory_metrics: True
do_predict: False
resume_from_checkpoint: None
warmup_ratio: None
local_rank: -1
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: round_robin
router_mapping: {}
learning_rate_mapping: {}

Training logs:

| Epoch | Step | empire-eval_spearman_cosine |
|---|---|---|
| 1.0 | 21 | 0.5320 |
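The logged row suggests that one epoch takes 21 optimizer steps. Assuming every epoch has the same number of steps, 5 epochs would give 105 steps in total; that total is an inference from the card, not a logged value. Under that assumption, a linear schedule with learning_rate 5e-05 and warmup_steps 0 can be sketched as follows (the function `linear_lr` is a hypothetical helper, not a library call):

```python
def linear_lr(step, base_lr=5e-05, total_steps=105, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    total_steps=105 is an assumption: 21 steps per epoch times 5 epochs.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, after the first epoch, and at the end.
for step in (0, 21, 105):
    print(step, linear_lr(step))
```

With no warmup, the rate starts at its peak and has dropped by one fifth by the time the first evaluation is logged at step 21.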
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model
nreimers/MiniLM-L6-H384-uncased