nampham1106 committed on
Commit
93d778d
1 Parent(s): 389d5c3

Update README.md

Files changed (1)
  1. README.md +2 -347
README.md CHANGED
@@ -1,20 +1,6 @@
  ---
  language:
- - ar
- - bg
- - de
- - el
- - en
- - es
- - fr
- - hi
- - ru
- - sw
- - th
- - tr
- - ur
  - vi
- - zh
  library_name: sentence-transformers
  tags:
  - sentence-transformers
@@ -201,7 +187,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [B
  - **Similarity Function:** Cosine Similarity
  - **Training Dataset:**
      - [facebook/xnli](https://huggingface.co/datasets/facebook/xnli)
- - **Languages:** ar, bg, de, el, en, es, fr, hi, ru, sw, th, tr, ur, vi, zh
+ - **Languages:** vi
  <!-- - **License:** Unknown -->

  ### Model Sources
@@ -235,7 +221,7 @@ Then you can load this model and run inference.
  from sentence_transformers import SentenceTransformer

  # Download from the 🤗 Hub
- model = SentenceTransformer("matryoshka_nli_BookingCare-bkcare-bert-pretrained-2024-07-19_04-21-48")
+ model = SentenceTransformer("nampham1106/bkcare-text-emb-v1.0")
  # Run inference
  sentences = [
      'Tôi sẽ làm tất cả những gì ông muốn. julius hạ khẩu súng lục .',
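
The snippet is cut off at the hunk boundary above; for context, a minimal end-to-end inference sketch. The second sentence and the similarity step are illustrative assumptions rather than the card's own code, and `model.similarity` is the Sentence Transformers 3.x helper:

```python
from sentence_transformers import SentenceTransformer

# Model id taken from the + line in this hunk
model = SentenceTransformer("nampham1106/bkcare-text-emb-v1.0")

sentences = [
    "Tôi sẽ làm tất cả những gì ông muốn. julius hạ khẩu súng lục .",
    # Placeholder second sentence; the card's full list is truncated above
    "Julius đặt khẩu súng xuống .",
]

# Encode both sentences, then compute the pairwise cosine-similarity matrix
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities)  # 2x2 tensor; the diagonal is ~1.0
```
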
@@ -313,334 +299,3 @@ You can finetune this model on your own dataset.
  | spearman_dot | 0.6631 |
  | pearson_max | 0.6851 |
  | spearman_max | 0.6695 |
-
- #### Semantic Similarity
- * Dataset: `sts-dev-256`
- * Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
-
- | Metric              | Value      |
- |:--------------------|:-----------|
- | pearson_cosine      | 0.6725     |
- | **spearman_cosine** | **0.6576** |
- | pearson_manhattan   | 0.6698     |
- | spearman_manhattan  | 0.6645     |
- | pearson_euclidean   | 0.672      |
- | spearman_euclidean  | 0.667      |
- | pearson_dot         | 0.6476     |
- | spearman_dot        | 0.6294     |
- | pearson_max         | 0.6725     |
- | spearman_max        | 0.667      |
-
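The `sts-dev-256` figures above come from scoring with embeddings truncated to the first 256 Matryoshka dimensions. A minimal sketch of how such an evaluation can be set up, assuming placeholder STS-style pairs and using the Sentence Transformers 3.x `truncate_dim` option:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Truncate every embedding to its first 256 dimensions at load time
model = SentenceTransformer("nampham1106/bkcare-text-emb-v1.0", truncate_dim=256)

# Placeholder sentence pairs with gold similarity scores in [0, 1]
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["Một người đàn ông đang chơi đàn .", "Trời đang mưa ."],
    sentences2=["Một người đang chơi nhạc cụ .", "Trời nắng ."],
    scores=[0.9, 0.1],
    name="sts-dev-256",
)

# Returns Pearson/Spearman correlations for cosine, Euclidean, Manhattan and dot
results = evaluator(model)
print(results)
```
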
- <!--
- ## Bias, Risks and Limitations
-
- *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
- -->
-
- <!--
- ### Recommendations
-
- *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
- -->
-
- ## Training Details
-
- ### Training Dataset
-
- #### facebook/xnli
-
- * Dataset: [facebook/xnli](https://huggingface.co/datasets/facebook/xnli) at [b8dd5d7](https://huggingface.co/datasets/facebook/xnli/tree/b8dd5d7af51114dbda02c0e3f6133f332186418e)
- * Size: 388,774 training samples
- * Columns: <code>premise</code>, <code>hypothesis</code>, and <code>label</code>
- * Approximate statistics based on the first 1000 samples:
-   |         | premise | hypothesis | label |
-   |:--------|:--------|:-----------|:------|
-   | type    | string  | string     | int   |
-   | details | <ul><li>min: 3 tokens</li><li>mean: 29.98 tokens</li><li>max: 309 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 15.64 tokens</li><li>max: 61 tokens</li></ul> | <ul><li>0: ~33.10%</li><li>1: ~35.60%</li><li>2: ~31.30%</li></ul> |
- * Samples:
-   | premise | hypothesis | label |
-   |:--------|:-----------|:------|
-   | <code>Những rắc rối với loại phân tích chi tiết này có nghĩa là bất kỳ nghệ nhân nào có thể nghiên cứu kỹ thuật của người nghệ thuật và tái tạo chúng -- sự chuẩn bị của hoffman .</code> | <code>Sự tái tạo là một quá trình dễ dàng .</code> | <code>2</code> |
-   | <code>Đó là một sự quan sát tỉnh rượu , để nhận ra rằng 80 phần trăm của những người cần sự giúp đỡ pháp lý bị từ chối những hướng dẫn và luật sự .</code> | <code>80 % những người cần sự trợ giúp pháp lý bị từ chối những hướng dẫn mà họ đang tìm kiếm , và đây là một suy nghĩ tỉnh rượu .</code> | <code>0</code> |
-   | <code>Đi qua cái để tìm nhà thờ của những hình xăm egios .</code> | <code>Nếu anh đi qua cái , anh sẽ tìm thấy mình ở bờ vực của thị trấn , không có gì ngoài nông thôn bên kia .</code> | <code>2</code> |
- * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
-   ```json
-   {
-       "loss": "MultipleNegativesRankingLoss",
-       "matryoshka_dims": [
-           768,
-           512,
-           256
-       ],
-       "matryoshka_weights": [
-           1,
-           1,
-           1
-       ],
-       "n_dims_per_step": -1
-   }
-   ```
-
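In code, these parameters correspond to wrapping `MultipleNegativesRankingLoss` in `MatryoshkaLoss`. A minimal sketch; the base-model id is a placeholder, since the exact source checkpoint is truncated in the hunk above:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

# Placeholder for the base checkpoint this card was finetuned from
model = SentenceTransformer("BookingCare/bkcare-bert-pretrained")

# Inner loss: in-batch negatives over (premise, hypothesis) pairs
inner_loss = MultipleNegativesRankingLoss(model)

# Apply the same loss at 768-, 512- and 256-dimensional truncations of each
# embedding, weighted equally; n_dims_per_step=-1 trains all dims every step
loss = MatryoshkaLoss(
    model,
    inner_loss,
    matryoshka_dims=[768, 512, 256],
    matryoshka_weights=[1, 1, 1],
    n_dims_per_step=-1,
)
```
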
- ### Evaluation Dataset
-
- #### facebook/xnli
-
- * Dataset: [facebook/xnli](https://huggingface.co/datasets/facebook/xnli) at [b8dd5d7](https://huggingface.co/datasets/facebook/xnli/tree/b8dd5d7af51114dbda02c0e3f6133f332186418e)
- * Size: 3,928 evaluation samples
- * Columns: <code>premise</code>, <code>hypothesis</code>, and <code>label</code>
- * Approximate statistics based on the first 1000 samples:
-   |         | premise | hypothesis | label |
-   |:--------|:--------|:-----------|:------|
-   | type    | string  | string     | int   |
-   | details | <ul><li>min: 4 tokens</li><li>mean: 32.3 tokens</li><li>max: 163 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 15.73 tokens</li><li>max: 53 tokens</li></ul> | <ul><li>0: ~32.40%</li><li>1: ~33.50%</li><li>2: ~34.10%</li></ul> |
- * Samples:
-   | premise | hypothesis | label |
-   |:--------|:-----------|:------|
-   | <code>Hai xu mắt anh ta warily .</code> | <code>Hai xu không nhìn anh ta .</code> | <code>2</code> |
-   | <code>Một không khí chung của glee permeated tất cả mọi người .</code> | <code>Mọi thứ đều cảm thấy hạnh phúc .</code> | <code>0</code> |
-   | <code>Tuy nhiên , một sự chắc chắn là dân số hoa kỳ đã bị lão hóa và sẽ có ít công nhân hỗ trợ mỗi người nghỉ hưu .</code> | <code>Trạng Thái lão hóa của dân số hoa kỳ được coi là một sự không chắc chắn .</code> | <code>2</code> |
- * Loss: [<code>MatryoshkaLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
-   ```json
-   {
-       "loss": "MultipleNegativesRankingLoss",
-       "matryoshka_dims": [
-           768,
-           512,
-           256
-       ],
-       "matryoshka_weights": [
-           1,
-           1,
-           1
-       ],
-       "n_dims_per_step": -1
-   }
-   ```
-
- ### Training Hyperparameters
- #### Non-Default Hyperparameters
-
- - `eval_strategy`: steps
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
- - `learning_rate`: 2e-05
- - `num_train_epochs`: 1
- - `warmup_ratio`: 0.1
- - `fp16`: True
- - `batch_sampler`: no_duplicates
-
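These values map one-to-one onto `SentenceTransformerTrainingArguments`; a sketch with a placeholder `output_dir`:

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/bkcare-text-emb",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
    # no_duplicates keeps repeated sentences out of a batch, avoiding false
    # negatives for MultipleNegativesRankingLoss
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)
```
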
- #### All Hyperparameters
- <details><summary>Click to expand</summary>
-
- - `overwrite_output_dir`: False
- - `do_predict`: False
- - `eval_strategy`: steps
- - `prediction_loss_only`: True
- - `per_device_train_batch_size`: 32
- - `per_device_eval_batch_size`: 32
- - `per_gpu_train_batch_size`: None
- - `per_gpu_eval_batch_size`: None
- - `gradient_accumulation_steps`: 1
- - `eval_accumulation_steps`: None
- - `learning_rate`: 2e-05
- - `weight_decay`: 0.0
- - `adam_beta1`: 0.9
- - `adam_beta2`: 0.999
- - `adam_epsilon`: 1e-08
- - `max_grad_norm`: 1.0
- - `num_train_epochs`: 1
- - `max_steps`: -1
- - `lr_scheduler_type`: linear
- - `lr_scheduler_kwargs`: {}
- - `warmup_ratio`: 0.1
- - `warmup_steps`: 0
- - `log_level`: passive
- - `log_level_replica`: warning
- - `log_on_each_node`: True
- - `logging_nan_inf_filter`: True
- - `save_safetensors`: True
- - `save_on_each_node`: False
- - `save_only_model`: False
- - `restore_callback_states_from_checkpoint`: False
- - `no_cuda`: False
- - `use_cpu`: False
- - `use_mps_device`: False
- - `seed`: 42
- - `data_seed`: None
- - `jit_mode_eval`: False
- - `use_ipex`: False
- - `bf16`: False
- - `fp16`: True
- - `fp16_opt_level`: O1
- - `half_precision_backend`: auto
- - `bf16_full_eval`: False
- - `fp16_full_eval`: False
- - `tf32`: None
- - `local_rank`: 0
- - `ddp_backend`: None
- - `tpu_num_cores`: None
- - `tpu_metrics_debug`: False
- - `debug`: []
- - `dataloader_drop_last`: False
- - `dataloader_num_workers`: 0
- - `dataloader_prefetch_factor`: None
- - `past_index`: -1
- - `disable_tqdm`: False
- - `remove_unused_columns`: True
- - `label_names`: None
- - `load_best_model_at_end`: False
- - `ignore_data_skip`: False
- - `fsdp`: []
- - `fsdp_min_num_params`: 0
- - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- - `fsdp_transformer_layer_cls_to_wrap`: None
- - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- - `deepspeed`: None
- - `label_smoothing_factor`: 0.0
- - `optim`: adamw_torch
- - `optim_args`: None
- - `adafactor`: False
- - `group_by_length`: False
- - `length_column_name`: length
- - `ddp_find_unused_parameters`: None
- - `ddp_bucket_cap_mb`: None
- - `ddp_broadcast_buffers`: False
- - `dataloader_pin_memory`: True
- - `dataloader_persistent_workers`: False
- - `skip_memory_metrics`: True
- - `use_legacy_prediction_loop`: False
- - `push_to_hub`: False
- - `resume_from_checkpoint`: None
- - `hub_model_id`: None
- - `hub_strategy`: every_save
- - `hub_private_repo`: False
- - `hub_always_push`: False
- - `gradient_checkpointing`: False
- - `gradient_checkpointing_kwargs`: None
- - `include_inputs_for_metrics`: False
- - `eval_do_concat_batches`: True
- - `fp16_backend`: auto
- - `push_to_hub_model_id`: None
- - `push_to_hub_organization`: None
- - `mp_parameters`:
- - `auto_find_batch_size`: False
- - `full_determinism`: False
- - `torchdynamo`: None
- - `ray_scope`: last
- - `ddp_timeout`: 1800
- - `torch_compile`: False
- - `torch_compile_backend`: None
- - `torch_compile_mode`: None
- - `dispatch_batches`: None
- - `split_batches`: None
- - `include_tokens_per_second`: False
- - `include_num_input_tokens_seen`: False
- - `neftune_noise_alpha`: None
- - `optim_target_modules`: None
- - `batch_eval_metrics`: False
- - `batch_sampler`: no_duplicates
- - `multi_dataset_batch_sampler`: proportional
-
- </details>
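
Putting the pieces together, a hedged end-to-end training sketch. The entailment-only pair conversion is an assumption about how the NLI data was prepared (the card does not show it), the base-model id and `output_dir` are placeholders, and the loss and arguments mirror the sections above:

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("BookingCare/bkcare-bert-pretrained")  # placeholder id

def entailment_pairs(split: str):
    # MultipleNegativesRankingLoss expects (anchor, positive) pairs, so keep
    # only entailment rows (label == 0) and drop the label column
    ds = load_dataset("facebook/xnli", "vi", split=split)
    return ds.filter(lambda ex: ex["label"] == 0).remove_columns("label")

loss = MatryoshkaLoss(
    model, MultipleNegativesRankingLoss(model), matryoshka_dims=[768, 512, 256]
)

args = SentenceTransformerTrainingArguments(
    output_dir="output/bkcare-text-emb",  # placeholder; full mapping above
    num_train_epochs=1,
    warmup_ratio=0.1,
    fp16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=entailment_pairs("train"),
    eval_dataset=entailment_pairs("validation"),
    loss=loss,
)
trainer.train()  # produces logs like the table below
```
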
-
- ### Training Logs
- | Epoch  | Step | Training Loss | loss   | sts-dev-256_spearman_cosine | sts-dev-512_spearman_cosine | sts-dev-768_spearman_cosine |
- |:------:|:----:|:-------------:|:------:|:---------------------------:|:---------------------------:|:---------------------------:|
- | 0      | 0    | -             | -      | 0.5425                      | 0.5569                      | 0.5593                      |
- | 0.0494 | 300  | 5.6741        | -      | -                           | -                           | -                           |
- | 0.0823 | 500  | -             | 2.9876 | 0.6417                      | 0.6479                      | 0.6502                      |
- | 0.0988 | 600  | 3.5541        | -      | -                           | -                           | -                           |
- | 0.1481 | 900  | 2.9032        | -      | -                           | -                           | -                           |
- | 0.1646 | 1000 | -             | 2.3400 | 0.6526                      | 0.6565                      | 0.6591                      |
- | 0.1975 | 1200 | 2.6495        | -      | -                           | -                           | -                           |
- | 0.2469 | 1500 | 2.426         | 2.1092 | 0.6359                      | 0.6466                      | 0.6501                      |
- | 0.2963 | 1800 | 2.2969        | -      | -                           | -                           | -                           |
- | 0.3292 | 2000 | -             | 1.9556 | 0.6390                      | 0.6491                      | 0.6516                      |
- | 0.3457 | 2100 | 2.1003        | -      | -                           | -                           | -                           |
- | 0.3951 | 2400 | 2.0975        | -      | -                           | -                           | -                           |
- | 0.4115 | 2500 | -             | 1.8133 | 0.6585                      | 0.6681                      | 0.6709                      |
- | 0.4444 | 2700 | 2.0403        | -      | -                           | -                           | -                           |
- | 0.4938 | 3000 | 1.9421        | 1.7629 | 0.6415                      | 0.6515                      | 0.6540                      |
- | 0.5432 | 3300 | 1.9313        | -      | -                           | -                           | -                           |
- | 0.5761 | 3500 | -             | 1.6924 | 0.6577                      | 0.6660                      | 0.6673                      |
- | 0.5926 | 3600 | 1.8582        | -      | -                           | -                           | -                           |
- | 0.6420 | 3900 | 1.8203        | -      | -                           | -                           | -                           |
- | 0.6584 | 4000 | -             | 1.6263 | 0.6527                      | 0.6620                      | 0.6635                      |
- | 0.6914 | 4200 | 1.8281        | -      | -                           | -                           | -                           |
- | 0.7407 | 4500 | 1.8037        | 1.5776 | 0.6572                      | 0.6677                      | 0.6685                      |
- | 0.7901 | 4800 | 1.7771        | -      | -                           | -                           | -                           |
- | 0.8230 | 5000 | -             | 1.5571 | 0.6548                      | 0.6652                      | 0.6665                      |
- | 0.8395 | 5100 | 1.7427        | -      | -                           | -                           | -                           |
- | 0.8889 | 5400 | 1.6901        | -      | -                           | -                           | -                           |
- | 0.9053 | 5500 | -             | 1.5385 | 0.6604                      | 0.6707                      | 0.6717                      |
- | 0.9383 | 5700 | 1.7977        | -      | -                           | -                           | -                           |
- | 0.9877 | 6000 | 1.6838        | 1.5279 | 0.6576                      | 0.6686                      | 0.6701                      |
-
-
- ### Framework Versions
- - Python: 3.10.13
- - Sentence Transformers: 3.0.1
- - Transformers: 4.41.2
- - PyTorch: 2.1.2
- - Accelerate: 0.30.1
- - Datasets: 2.19.2
- - Tokenizers: 0.19.1
-
- ## Citation
-
- ### BibTeX
-
- #### Sentence Transformers
- ```bibtex
- @inproceedings{reimers-2019-sentence-bert,
-     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
-     author = "Reimers, Nils and Gurevych, Iryna",
-     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
-     month = "11",
-     year = "2019",
-     publisher = "Association for Computational Linguistics",
-     url = "https://arxiv.org/abs/1908.10084",
- }
- ```
-
- #### MatryoshkaLoss
- ```bibtex
- @misc{kusupati2024matryoshka,
-     title={Matryoshka Representation Learning},
-     author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
-     year={2024},
-     eprint={2205.13147},
-     archivePrefix={arXiv},
-     primaryClass={cs.LG}
- }
- ```
-
- #### MultipleNegativesRankingLoss
- ```bibtex
- @misc{henderson2017efficient,
-     title={Efficient Natural Language Response Suggestion for Smart Reply},
-     author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
-     year={2017},
-     eprint={1705.00652},
-     archivePrefix={arXiv},
-     primaryClass={cs.CL}
- }
- ```
-
- <!--
- ## Glossary
-
- *Clearly define terms in order to be accessible across audiences.*
- -->
-
- <!--
- ## Model Card Authors
-
- *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
- -->
-
- <!--
- ## Model Card Contact
-
- *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
- -->