tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:10000
  - loss:OnlineContrastiveLoss
base_model: jinaai/jina-embeddings-v3
widget:
  - source_sentence: >-
      i be try to picture the pitch for dark angel . i be think matrix and i be
      think bladerunner and i be think that chick that play faith in angel and
      wear shiny black leather or some chick just like her and leave that one
      with u . only get this . we will do it without any plot and dialogue and
      character and decent action or budget and just some loud bang and a hot
      chick in shiny black leather straddle a big throbbing bike . fanboys dig
      loud bang and hot chick in shiny black leather straddle big throbbing bike
      and right . flashy and shallow and dreary and formulaic and passionless
      and tedious and dull and dumb and humourless and desultory and barely
      competent . live action anime without any action and or indeed any life .
      sf just the way joe fanboy like it and in fact . negative .
    sentences:
      - This is a semantically positive review.
      - This is a semantically negative review.
      - This is a semantically positive review.
  - source_sentence: >-
      despite the high rating give to this film by imdb user and this be nothing
      more than your typical girl with a bad childhood obsessively stalks
      married man film . the attractive justine priestly brief nude scene may
      attract voyeur and but the film be hackneyed tripe . half out of .
    sentences:
      - This is a semantically positive review.
      - This is a semantically positive review.
      - This is a semantically positive review.
  - source_sentence: >-
      this movie portray ruth a a womanizing and hard drinking and gambling and
      overeat sport figure with a little baseball thrown in . babe ruth early
      life be quite interesting and this be for all intent and purpose be omit
      in this film . also and lou gehrig be barely cover and this be a well know
      relationship and good bad or indifferent and it should have be cover well
      than it be . his life be more than all bad . he be an american hero and an
      icon that a lot of baseball great pattern their life after . i feel that i
      be be fair to the memory of a great baseball player that this film
      completely ignore . shame on the maker of this film for capitalize on his
      fault and not his greatness .
    sentences:
      - This is a semantically positive review.
      - This is a semantically negative review.
      - This is a semantically positive review.
  - source_sentence: >-
      the silent one panel cartoon henry come to fleischer studio and bill a the
      world funny human in this dull little cartoon . betty and long past her
      prime and thanks to the production code and be run a pet shop and leave
      henry in charge for far too long five minute . a bore .
    sentences:
      - This is a semantically positive review.
      - This is a semantically negative review.
      - This is a semantically negative review.
  - source_sentence: >-
      zu warrior most definitely should have be an animated series because a a
      movie it like watch an old anime on acid . the movie just start out of
      nowhere and people just fly around fight with metal wing and other stupid
      weapon until this princess sacrifice herself for her lover on a cloud or
      something . whether this princess be a god or an angel be beyond me but
      soon enough this fly wind bad guy come in and kill her while the guy with
      the razor wing fight some other mystical god or demon or wizard thing .
      the plot line be either not there or extremely hard to follow you need to
      be insanely intelligent to get this movie . the plot soon follow this
      chinese mortal who be call upon by this god to fight the evil flying and
      princess kill bad guy and soon we have a very badly choreograph uwe boll
      like fight scene complete with terrible martial art on a mountain or
      something . even the visuals be weird some might say they be stun and
      colorful but i be go to say they be blurry and acid trip like ( yes that a
      word . ) . i watch it both dub and with subtitle and both be equally bad
      and hard to understand . who be i kidding i do not understand it at all .
      it felt like i be watch episode 30 of some 1980 anime and completely miss
      how the story begin or like i start read a comic series of 5 at number 4
      because i have no clue how this thing start where it be go or how it would
      end i be lose the entire time . i can honestly say this be one of the bad
      film experience ever it be like watch inu yasha at episode 134 drunk .
      yeah that right you do not know what the hell be go on . don not waste
      your brain try to figure this out .
    sentences:
      - This is a semantically positive review.
      - This is a semantically negative review.
      - This is a semantically positive review.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on jinaai/jina-embeddings-v3

This is a sentence-transformers model finetuned from jinaai/jina-embeddings-v3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: jinaai/jina-embeddings-v3
  • Maximum Sequence Length: 8194 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
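
Cosine similarity, the similarity function listed above, compares two embeddings by the angle between them rather than by their magnitude. A minimal pure-Python sketch of the formula follows. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of the model's API; real embeddings from this model have 1024 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (hypothetical values).
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # same direction
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # orthogonal
```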

Full Model Architecture

SentenceTransformer(
  (transformer): Transformer(
    (auto_model): XLMRobertaLoRA(
      (roberta): XLMRobertaModel(
        (embeddings): XLMRobertaEmbeddings(
          (word_embeddings): ParametrizedEmbedding(
            250002, 1024, padding_idx=1
            (parametrizations): ModuleDict(
              (weight): ParametrizationList(
                (0): LoRAParametrization()
              )
            )
          )
          (token_type_embeddings): ParametrizedEmbedding(
            1, 1024
            (parametrizations): ModuleDict(
              (weight): ParametrizationList(
                (0): LoRAParametrization()
              )
            )
          )
        )
        (emb_drop): Dropout(p=0.1, inplace=False)
        (emb_ln): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
        (encoder): XLMRobertaEncoder(
          (layers): ModuleList(
            (0-23): 24 x Block(
              (mixer): MHA(
                (rotary_emb): RotaryEmbedding()
                (Wqkv): ParametrizedLinearResidual(
                  in_features=1024, out_features=3072, bias=True
                  (parametrizations): ModuleDict(
                    (weight): ParametrizationList(
                      (0): LoRAParametrization()
                    )
                  )
                )
                (inner_attn): FlashSelfAttention(
                  (drop): Dropout(p=0.1, inplace=False)
                )
                (inner_cross_attn): FlashCrossAttention(
                  (drop): Dropout(p=0.1, inplace=False)
                )
                (out_proj): ParametrizedLinear(
                  in_features=1024, out_features=1024, bias=True
                  (parametrizations): ModuleDict(
                    (weight): ParametrizationList(
                      (0): LoRAParametrization()
                    )
                  )
                )
              )
              (dropout1): Dropout(p=0.1, inplace=False)
              (drop_path1): StochasticDepth(p=0.0, mode=row)
              (norm1): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
              (mlp): Mlp(
                (fc1): ParametrizedLinear(
                  in_features=1024, out_features=4096, bias=True
                  (parametrizations): ModuleDict(
                    (weight): ParametrizationList(
                      (0): LoRAParametrization()
                    )
                  )
                )
                (fc2): ParametrizedLinear(
                  in_features=4096, out_features=1024, bias=True
                  (parametrizations): ModuleDict(
                    (weight): ParametrizationList(
                      (0): LoRAParametrization()
                    )
                  )
                )
              )
              (dropout2): Dropout(p=0.1, inplace=False)
              (drop_path2): StochasticDepth(p=0.0, mode=row)
              (norm2): LayerNorm((1024,), eps=1e-05, elementwise_affine=True)
            )
          )
        )
        (pooler): XLMRobertaPooler(
          (dense): ParametrizedLinear(
            in_features=1024, out_features=1024, bias=True
            (parametrizations): ModuleDict(
              (weight): ParametrizationList(
                (0): LoRAParametrization()
              )
            )
          )
          (activation): Tanh()
        )
      )
    )
  )
  (pooler): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (normalizer): Normalize()
)
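
The last two modules in the printout turn per-token outputs into a single sentence vector: `Pooling` with `pooling_mode_mean_tokens` set to True, then `Normalize`. The sketch below illustrates that idea in plain Python on toy data. The helper names `mean_pool` and `l2_normalize` are hypothetical, and the real modules operate on batched tensors rather than Python lists.

```python
import math


def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


# Three toy 2-dimensional token vectors; the last one is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output has unit length, the cosine similarity of two such embeddings reduces to their dot product.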

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ELVISIO/jina_embeddings_v3_finetuned_online_contrastive_01", trust_remote_code=True, model_kwargs={'default_task': 'classification'})
# Run inference
sentences = [
    'zu warrior most definitely should have be an animated series because a a movie it like watch an old anime on acid . the movie just start out of nowhere and people just fly around fight with metal wing and other stupid weapon until this princess sacrifice herself for her lover on a cloud or something . whether this princess be a god or an angel be beyond me but soon enough this fly wind bad guy come in and kill her while the guy with the razor wing fight some other mystical god or demon or wizard thing . the plot line be either not there or extremely hard to follow you need to be insanely intelligent to get this movie . the plot soon follow this chinese mortal who be call upon by this god to fight the evil flying and princess kill bad guy and soon we have a very badly choreograph uwe boll like fight scene complete with terrible martial art on a mountain or something . even the visuals be weird some might say they be stun and colorful but i be go to say they be blurry and acid trip like ( yes that a word . ) . i watch it both dub and with subtitle and both be equally bad and hard to understand . who be i kidding i do not understand it at all . it felt like i be watch episode 30 of some 1980 anime and completely miss how the story begin or like i start read a comic series of 5 at number 4 because i have no clue how this thing start where it be go or how it would end i be lose the entire time . i can honestly say this be one of the bad film experience ever it be like watch inu yasha at episode 134 drunk . yeah that right you do not know what the hell be go on . don not waste your brain try to figure this out .',
    'This is a semantically negative review.',
    'This is a semantically positive review.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
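
The example above encodes one review followed by the two label sentences, so the first row of `similarities` holds the review's similarity to itself and then to each label. One way to turn that row into a prediction is to take the label with the highest score. The sketch below shows this; `pick_label` is a hypothetical helper, and the scores are made-up numbers for illustration, not real model output.

```python
def pick_label(review_row, labels):
    """Return the label whose similarity to the review is highest.

    review_row: similarities of the review to [itself, label_0, label_1, ...]
    labels:     the label sentences, in the same order they were encoded
    """
    label_scores = list(review_row)[1:]  # drop the review's self-similarity
    if len(label_scores) != len(labels):
        raise ValueError("expected one score per label")
    best = max(range(len(labels)), key=lambda i: label_scores[i])
    return labels[best], label_scores[best]


labels = [
    "This is a semantically negative review.",
    "This is a semantically positive review.",
]
# Hypothetical scores for illustration only, not real model output.
example_row = [1.0, 0.71, 0.12]
print(pick_label(example_row, labels))
```

With the real model, `similarities[0].tolist()` should be usable as `review_row`.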

Training Details

Training Dataset

Unnamed Dataset

  • Size: 10000 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
                sentence1        sentence2      label
    type        string           string         float
    min         19 tokens        11 tokens      0.0
    mean        300.92 tokens    11.0 tokens    0.5
    max         1415 tokens      11 tokens      1.0
  • Samples:
    sentence1 sentence2 label
    i rent i be curious yellow from my video store because of all the controversy that surround it when it be first release in 1967. i also hear that at first it be seize by u. s. custom if it ever try to enter this country and therefore be a fan of film consider controversial i really have to see this for myself . the plot be center around a young swedish drama student name lena who want to learn everything she can about life . in particular she want to focus her attention to make some sort of documentary on what the average swede think about certain political issue such a the vietnam war and race issue in the united state . in between ask politician and ordinary denizen of stockholm about their opinion on politics and she have sex with her drama teacher and classmate and and marry men . what kill me about i be curious yellow be that 40 year ago and this be consider pornographic . really and the sex and nudity scene be few and far between and even then it not shot like some cheaply make porno . while my countryman mind find it shock and in reality sex and nudity be a major staple in swedish cinema . even ingmar bergman and arguably their answer to good old boy john ford and have sex scene in his film . i do commend the filmmaker for the fact that any sex show in the film be show for artistic purpose rather than just to shock people and make money to be show in pornographic theater in america . i be curious yellow be a good film for anyone want to study the meat and potato ( no pun intend ) of swedish cinema . but really and this film doesn not have much of a plot . This is a semantically negative review. 1.0
    i rent i be curious yellow from my video store because of all the controversy that surround it when it be first release in 1967. i also hear that at first it be seize by u. s. custom if it ever try to enter this country and therefore be a fan of film consider controversial i really have to see this for myself . the plot be center around a young swedish drama student name lena who want to learn everything she can about life . in particular she want to focus her attention to make some sort of documentary on what the average swede think about certain political issue such a the vietnam war and race issue in the united state . in between ask politician and ordinary denizen of stockholm about their opinion on politics and she have sex with her drama teacher and classmate and and marry men . what kill me about i be curious yellow be that 40 year ago and this be consider pornographic . really and the sex and nudity scene be few and far between and even then it not shot like some cheaply make porno . while my countryman mind find it shock and in reality sex and nudity be a major staple in swedish cinema . even ingmar bergman and arguably their answer to good old boy john ford and have sex scene in his film . i do commend the filmmaker for the fact that any sex show in the film be show for artistic purpose rather than just to shock people and make money to be show in pornographic theater in america . i be curious yellow be a good film for anyone want to study the meat and potato ( no pun intend ) of swedish cinema . but really and this film doesn not have much of a plot . This is a semantically positive review. 0.0
    i be curious represent yellow be a risible and pretentious steam pile . it doesn not matter what one political view be because this film can hardly be take seriously on any level . a for the claim that frontal male nudity be an automatic nc 17 and that isn not true . i have see r rat film with male nudity . grant and they only offer some fleeting view and but where be the r rat film with gap vulva and flap labium . nowhere and because they do not exist . the same go for those crappy cable show represent schlongs swing in the breeze but not a clitoris in sight . and those pretentious indie movie like the brown bunny and in which be treat to the site of vincent gallo throb johnson and but not a trace of pink visible on chloe sevigny . before cry ( or imply ) double standard in matter of nudity and the mentally obtuse should take into account one unavoidably obvious anatomical difference between men and woman represent there be no genitals on display when actresses appear nude and and the same can not be say for a man . in fact and you generally would not see female genitals in an american film in anything short of porn or explicit erotica . this allege double standard be less a double standard than an admittedly depressing ability to come to term culturally with the inside of woman body . This is a semantically negative review. 1.0
  • Loss: OnlineContrastiveLoss
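
OnlineContrastiveLoss differs from a plain contrastive loss in that only the hard pairs in each batch contribute. The sketch below illustrates that selection in pure Python, based on a general understanding of the technique. It is not the library's implementation and omits its edge-case handling. The margin of 0.5 and the use of cosine distance (1 minus cosine similarity) are assumptions, since the card does not state them, and the name `online_contrastive_loss` is illustrative.

```python
def online_contrastive_loss(distances, labels, margin=0.5):
    """Contrastive loss restricted to the hard pairs of one batch.

    distances: one distance per (sentence1, sentence2) pair
    labels:    1 for pairs that should be close, 0 for pairs that should be apart
    margin:    assumed value; negatives farther apart than this cost nothing
    """
    positives = [d for d, y in zip(distances, labels) if y == 1]
    negatives = [d for d, y in zip(distances, labels) if y == 0]

    # Hard positives: matching pairs farther apart than the closest negative.
    # Hard negatives: non-matching pairs closer than the farthest positive.
    # If one class is missing, this sketch keeps every pair of the other class.
    hard_pos = [d for d in positives if not negatives or d > min(negatives)]
    hard_neg = [d for d in negatives if not positives or d < max(positives)]

    positive_loss = sum(d ** 2 for d in hard_pos)
    negative_loss = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return positive_loss + negative_loss


# Toy batch: cosine distances for four pairs (hypothetical values).
distances = [0.1, 0.6, 0.3, 0.9]
labels = [1, 1, 0, 0]
print(online_contrastive_loss(distances, labels))
```

In the toy batch, only the positive pair at distance 0.6 and the negative pair at distance 0.3 are selected; the other two pairs are already on the right side of each other and add nothing to the loss.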

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3.0
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
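
The list above gives `learning_rate: 5e-05`, `lr_scheduler_type: linear`, and `warmup_steps: 0`. Under those settings the learning rate should start at 5e-05 and fall linearly to zero by the final step. The sketch below computes that schedule in plain Python. `linear_lr` is a hypothetical helper, and `TOTAL_STEPS` is a placeholder because the card does not state the exact number of training steps.

```python
def linear_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Placeholder total; the card does not state the exact number of steps.
TOTAL_STEPS = 1000
for step in (0, 500, 1000):
    print(step, linear_lr(step, TOTAL_STEPS))
```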

Training Logs

Epoch     Step    Training Loss
0.6394    500     0.9485
1.2788    1000    0.6908
1.9182    1500    0.7048
2.5575    2000    0.6892
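
Each log row pairs a global step with a fractional epoch, so the number of optimizer steps per epoch can be estimated by dividing one by the other. The sketch below does that for every row. `steps_per_epoch` is a hypothetical helper, and the final multiplication assumes a single device with `per_device_train_batch_size` 64 and `gradient_accumulation_steps` 1.

```python
def steps_per_epoch(step, epoch_fraction):
    """Estimate optimizer steps in one epoch from a single log row."""
    if epoch_fraction <= 0:
        raise ValueError("epoch_fraction must be positive")
    return round(step / epoch_fraction)


# Rows copied from the training log: (epoch, step).
log_rows = [(0.6394, 500), (1.2788, 1000), (1.9182, 1500), (2.5575, 2000)]
estimates = [steps_per_epoch(step, epoch) for epoch, step in log_rows]
print(estimates)

# Approximate samples seen per epoch, assuming one device, batch size 64,
# and no gradient accumulation.
print(estimates[0] * 64)
```

Every row gives roughly 782 steps per epoch, or about 50,000 samples per epoch under the stated assumptions. That is larger than the 10000 training samples listed in this card, and the card does not say why, so the estimate should be read with caution.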

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}