BAAI/bge-large-en-v1.5 trained for News Type Classification

This is a Cross Encoder model fine-tuned from BAAI/bge-large-en-v1.5 using the sentence-transformers library. It takes a pair of texts and outputs one score per label (10 in total), which can be used for text pair classification.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-large-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Number of Output Labels: 10 labels
  • Language: en
  • License: apache-2.0

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("corleymj/snippet-entity-news-classification-v2")
# Get scores for pairs of texts
pairs = [
    ["Dennis Schroder potential trade piece, but Brooklyn Nets won't just give him away", 'Basketball: Pre Game - Entity: Dennis Schroder'],
    ['The Vancouver Canucks are currently facing a cap crunch after the Lafferty pickup (Sam) Lafferty.', 'Hockey: Pre Game - Entity: Lafferty'],
    ['Additional news for 12/11 Out: Moses Moody Questionable: Steven Adams Not on injury report: Tari Eason', 'Basketball: Pre Game - Entity: Moses Moody'],
    ['Additional news for 12/11 Out: Moses Moody Questionable: Steven Adams Not on injury report: Tari Eason', 'Basketball: Pre Game - Entity: Steven Adams'],
    ['Additional news for 12/11 Out: Moses Moody Questionable: Steven Adams Not on injury report: Tari Eason', 'Basketball: Pre Game - Entity: Tari Eason'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5, 10)
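
The training samples further down carry multi-hot label vectors, so one plausible way to read the 10 scores per pair is as independent per-label logits. The sketch below makes that assumption explicit: it applies a sigmoid to each score and keeps the labels that clear a threshold. The function names, the 0.5 threshold, and the example numbers are illustrative and not part of this model card; if the model already returns probabilities, skip the sigmoid.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def decode_multilabel(scores, threshold=0.5):
    """For each pair, return the indices of labels whose probability clears the threshold.

    Assumes `scores` holds raw logits with one row per pair (an assumption,
    not something the model card states).
    """
    decoded = []
    for row in scores:
        probs = [sigmoid(float(v)) for v in row]
        decoded.append([i for i, p in enumerate(probs) if p >= threshold])
    return decoded


# Made-up logits for two pairs (not real model output)
example_scores = [
    [-4.0, 3.2, -2.5, -3.0, 1.1, -5.0, -4.2, -3.3, -2.8, -4.9],
    [-3.0, -2.0, -4.0, 2.7, -1.5, -3.5, -4.1, -2.2, -3.9, -4.4],
]
print(decode_multilabel(example_scores))
# [[1, 4], [3]]
```

With the model loaded as above, `decode_multilabel(model.predict(pairs))` would give one list of label indices per pair. The mapping from index to label name is not given in this card.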

Training Details

Training Dataset

Unnamed Dataset

  • Size: 7,434 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1 (string): min 13 characters, mean 141.82 characters, max 320 characters
    • sentence2 (string): min 31 characters, mean 39.51 characters, max 53 characters
    • label (list): size 10 elements
  • Samples:
    • sentence1: Dennis Schroder potential trade piece, but Brooklyn Nets won't just give him away
      sentence2: Basketball: Pre Game - Entity: Dennis Schroder
      label: [0.0, 0.0, 0.0, 0.0, 0.0, ...]
    • sentence1: The Vancouver Canucks are currently facing a cap crunch after the Lafferty pickup (Sam) Lafferty.
      sentence2: Hockey: Pre Game - Entity: Lafferty
      label: [0.0, 0.0, 0.0, 1.0, 0.0, ...]
    • sentence1: Additional news for 12/11 Out: Moses Moody Questionable: Steven Adams Not on injury report: Tari Eason
      sentence2: Basketball: Pre Game - Entity: Moses Moody
      label: [0.0, 1.0, 0.0, 0.0, 1.0, ...]
  • Loss: __main__.MultiLabelCrossEntropyLoss
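
To make the row layout concrete, here is a minimal sketch of how one training row in this shape could be assembled. The sentence2 template is inferred from the samples above, and `make_row`, its parameters, and the label index used in the example are hypothetical: the card neither names the 10 labels nor shows the full label vectors.

```python
NUM_LABELS = 10  # "Number of Output Labels" from the model description


def make_row(snippet, sport, phase, entity, label_ids):
    """Build one row with the columns sentence1, sentence2, and label.

    `label_ids` are the indices of the active labels; they are encoded as a
    multi-hot list of floats, matching the shape of the samples in this card.
    """
    sentence2 = f"{sport}: {phase} - Entity: {entity}"
    label = [0.0] * NUM_LABELS
    for i in label_ids:
        if not 0 <= i < NUM_LABELS:
            raise ValueError(f"label index {i} is outside 0..{NUM_LABELS - 1}")
        label[i] = 1.0
    return {"sentence1": snippet, "sentence2": sentence2, "label": label}


# Illustrative only: the real label vector for this snippet is truncated in the card.
row = make_row(
    "The Vancouver Canucks are currently facing a cap crunch after the Lafferty pickup (Sam) Lafferty.",
    sport="Hockey",
    phase="Pre Game",
    entity="Lafferty",
    label_ids=[3],
)
print(row["sentence2"])
print(row["label"])
```

A snippet that mentions several entities yields one row per entity, as in the "Additional news for 12/11" samples, where the same sentence1 is paired with three different sentence2 values.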

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 10
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
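
These settings, combined with the dataset size, fix the length of the run. The sketch below works through the arithmetic, assuming a single device (the card does not state the device count; gradient_accumulation_steps is 1 and dataloader_drop_last is False per the full list). The `linear_lr` helper is an illustrative reimplementation of a linear warmup and decay schedule, not code taken from the training script.

```python
import math

num_samples = 7434   # training set size from this card
batch_size = 16      # per_device_train_batch_size
num_epochs = 10      # num_train_epochs
warmup_ratio = 0.1
peak_lr = 2e-5       # learning_rate

steps_per_epoch = math.ceil(num_samples / batch_size)  # last partial batch is kept
total_steps = steps_per_epoch * num_epochs
warmup_steps = round(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear ramp up to peak_lr, then linear decay to 0."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0, total_steps - step) / (total_steps - warmup_steps)


print(steps_per_epoch, total_steps, warmup_steps)
# 465 4650 465
print(round(200 / steps_per_epoch, 4))
# 0.4301
```

The 465 steps per epoch agree with the training logs, which report step 200 at epoch 0.4301. Under these assumptions the warmup covers roughly the first epoch.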

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 10
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss
0.0022 1 1.0293
0.4301 200 0.9024
0.8602 400 0.4968
1.2903 600 0.3217
1.7204 800 0.2909
2.1505 1000 0.2133
2.5806 1200 0.1821
3.0108 1400 0.1737
3.4409 1600 0.1152
3.8710 1800 0.1224
4.3011 2000 0.1019
4.7312 2200 0.0909
5.1613 2400 0.0806
5.5914 2600 0.0634
6.0215 2800 0.0689
6.4516 3000 0.0503
6.8817 3200 0.0532
7.3118 3400 0.0453
7.7419 3600 0.0405
8.1720 3800 0.032
8.6022 4000 0.029
9.0323 4200 0.032
9.4624 4400 0.0246
9.8925 4600 0.0245
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.1.1
  • Transformers: 4.51.0
  • PyTorch: 2.8.0
  • Accelerate: 1.10.1
  • Datasets: 4.1.1
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}