HZeroxium/paraphrase-multilingual-MiniLM-L12-v2-job-cv-multi-dataset

@@ -4,41 +4,45 @@ tags:
 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
-- dataset_size:2461
 - loss:ContrastiveLoss
 - loss:TripletLoss
 base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
 widget:
-- source_sentence: Kỹ sư tự động hóa, 3 năm kinh nghiệm lập trình robot công nghiệp.
   sentences:
-  - Customer Service Specialist với kỹ năng giải quyết vấn đề khách hàng.
-  - Tuyển chuyên viên marketing xây dựng chiến lược thương hiệu.
-  - Automation Engineer yêu cầu kinh nghiệm với robot công nghiệp.
-- source_sentence: Chuyên viên pháp lý, tư vấn luật hợp đồng doanh nghiệp.
   sentences:
-  - Tuyển nhân viên QA kiểm thử phần mềm tự động.
-  - Legal Consultant chuyên về hợp đồng doanh nghiệp.
-  - Tuyển Web Developer với kỹ năng lập trình web cơ bản.
-- source_sentence: Chuyên viên SEO với 3 năm kinh nghiệm tối ưu hóa công cụ tìm kiếm.
   sentences:
-  - Tuyển Data Scientist có kinh nghiệm Machine Learning.
-  - Tuyển kỹ thuật viên xét nghiệm có kinh nghiệm làm việc trong phòng thí nghiệm
-    y tế.
-  - Tuyển Mechanical Engineer có kinh nghiệm thiết kế hệ thống cơ khí.
-- source_sentence: DevOps Engineer, kinh nghiệm 4 năm sử dụng Docker, Kubernetes.
   sentences:
-  - Operation Specialist với kỹ năng cải thiện hiệu suất sản xuất.
-  - Tuyển Finance Analyst.
-  - Tuyển DevOps Engineer với kinh nghiệm containerization.
-- source_sentence: Tôi là lập trình viên Android, có kinh nghiệm với Kotlin và hệ
-    thống thanh toán.
   sentences:
-  - Tuyển Android Developer, yêu cầu kinh nghiệm tích hợp thanh toán.
-  - Tuyển dụng Mobile Developer có kinh nghiệm đa nền tảng.
-  - Tuyển Software Engineer thành thạo Java và Spring Boot.
 datasets:
-- HZeroxium/cv-job-binary
 - HZeroxium/cv-job-triplet
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -49,6 +53,8 @@ metrics:
 - cosine_precision
 - cosine_recall
 - cosine_ap
 model-index:
 - name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
   results:
@@ -60,25 +66,67 @@ model-index:
       type: unknown
     metrics:
     - type: cosine_accuracy
-      value: 0.9767441860465116
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
-      value: 0.7162894010543823
       name: Cosine Accuracy Threshold
     - type: cosine_f1
-      value: 0.9782608695652174
       name: Cosine F1
     - type: cosine_f1_threshold
-      value: 0.7162894010543823
       name: Cosine F1 Threshold
     - type: cosine_precision
-      value: 0.967741935483871
       name: Cosine Precision
     - type: cosine_recall
-      value: 0.989010989010989
       name: Cosine Recall
     - type: cosine_ap
-      value: 0.9802797763086614
       name: Cosine Ap
   - task:
       type: triplet
@@ -90,11 +138,24 @@ model-index:
     - type: cosine_accuracy
       value: 1.0
       name: Cosine Accuracy
 ---
 # SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) on the [binary](https://huggingface.co/datasets/HZeroxium/cv-job-binary) and [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet) datasets. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ## Model Details
@@ -105,8 +166,11 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [s
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
 - **Training Datasets:**
-    - [binary](https://huggingface.co/datasets/HZeroxium/cv-job-binary)
     - [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet)
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
@@ -140,12 +204,12 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 # Download from the 🤗 Hub
-model = SentenceTransformer("HZeroxium/paraphrase-multilingual-MiniLM-L12-v2-job-cv-multi-dataset")
 # Run inference
 sentences = [
-    'Tôi là lập trình viên Android, có kinh nghiệm với Kotlin và hệ thống thanh toán.',
-    'Tuyển Android Developer, yêu cầu kinh nghiệm tích hợp thanh toán.',
-    'Tuyển Software Engineer thành thạo Java và Spring Boot.',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
@@ -191,13 +255,13 @@ You can finetune this model on your own dataset.
 | Metric                    | Value      |
 |:--------------------------|:-----------|
-| cosine_accuracy           | 0.9767     |
-| cosine_accuracy_threshold | 0.7163     |
-| cosine_f1                 | 0.9783     |
-| cosine_f1_threshold       | 0.7163     |
-| cosine_precision          | 0.9677     |
-| cosine_recall             | 0.989      |
-| **cosine_ap**             | **0.9803** |
 #### Triplet
@@ -207,6 +271,43 @@ You can finetune this model on your own dataset.
 |:--------------------|:--------|
 | **cosine_accuracy** | **1.0** |
 <!--
 ## Bias, Risks and Limitations
@@ -225,20 +326,20 @@ You can finetune this model on your own dataset.
 #### binary
-* Dataset: [binary](https://huggingface.co/datasets/HZeroxium/cv-job-binary) at [07e2530](https://huggingface.co/datasets/HZeroxium/cv-job-binary/tree/07e2530d65574aec0375699117d9cac8cf38986e)
-* Size: 1,543 training samples
-* Columns: <code>cv</code>, <code>job</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
-  |         | cv                                                                                 | job                                                                               | label                                           |
-  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
-  | type    | string                                                                             | string                                                                            | int                                             |
-  | details | <ul><li>min: 12 tokens</li><li>mean: 21.22 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.56 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>0: ~38.20%</li><li>1: ~61.80%</li></ul> |
 * Samples:
-  | cv                                                                                       | job                                                                | label          |
-  |:-----------------------------------------------------------------------------------------|:-------------------------------------------------------------------|:---------------|
-  | <code>Giáo viên mầm non với kỹ năng giảng dạy trẻ em.</code>                             | <code>Tuyển Kindergarten Teacher có kinh nghiệm mầm non.</code>    | <code>1</code> |
-  | <code>Nhân viên kế toán với kinh nghiệm làm việc trong các doanh nghiệp sản xuất.</code> | <code>Tuyển Data Engineer có kinh nghiệm xử lý dữ liệu lớn.</code> | <code>0</code> |
-  | <code>Chuyên viên nhân sự, có kinh nghiệm quản lý và đào tạo nhân viên.</code>           | <code>Tuyển Embedded Systems Developer.</code>                     | <code>0</code> |
 * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
   ```json
   {
@@ -250,20 +351,20 @@ You can finetune this model on your own dataset.
 #### triplet
-* Dataset: [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet) at [c8215b6](https://huggingface.co/datasets/HZeroxium/cv-job-triplet/tree/c8215b694523650ad1d37b0ee2d182978c42094d)
-* Size: 918 training samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
-* Approximate statistics based on the first 918 samples:
-  |         | anchor                                                                             | positive                                                                          | negative                                                                          |
-  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
-  | type    | string                                                                             | string                                                                            | string                                                                            |
-  | details | <ul><li>min: 12 tokens</li><li>mean: 18.79 tokens</li><li>max: 30 tokens</li></ul> | <ul><li>min: 9 tokens</li><li>mean: 15.09 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 13.68 tokens</li><li>max: 20 tokens</li></ul> |
 * Samples:
-  | anchor                                                                                 | positive                                                                            | negative                                                             |
-  |:---------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:---------------------------------------------------------------------|
-  | <code>Graphic Designer, chuyên thiết kế logo và bộ nhận diện thương hiệu.</code>       | <code>Tuyển dụng Graphic Designer thành thạo Adobe Illustrator.</code>              | <code>Tuyển kỹ sư điện làm việc trong nhà máy sản xuất.</code>       |
-  | <code>Kỹ sư xây dựng, 5 năm kinh nghiệm thiết kế và quản lý dự án xây dựng.</code>     | <code>Tuyển dụng Construction Manager có kinh nghiệm quản lý dự án xây dựng.</code> | <code>Tuyển nhân viên bán hàng cho các sản phẩm thời trang.</code>   |
-  | <code>Software Engineer, 2 năm kinh nghiệm phát triển ứng dụng web với Node.js.</code> | <code>Tuyển dụng Backend Developer thành thạo Node.js.</code>                       | <code>Tuyển nhân viên hỗ trợ kỹ thuật trong ngành viễn thông.</code> |
 * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
   ```json
   {
@@ -272,24 +373,96 @@ You can finetune this model on your own dataset.
   }
   ```
 ### Evaluation Datasets
 #### binary
-* Dataset: [binary](https://huggingface.co/datasets/HZeroxium/cv-job-binary) at [07e2530](https://huggingface.co/datasets/HZeroxium/cv-job-binary/tree/07e2530d65574aec0375699117d9cac8cf38986e)
-* Size: 172 evaluation samples
-* Columns: <code>cv</code>, <code>job</code>, and <code>label</code>
-* Approximate statistics based on the first 172 samples:
-  |         | cv                                                                                 | job                                                                               | label                                           |
   |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
   | type    | string                                                                             | string                                                                            | int                                             |
-  | details | <ul><li>min: 16 tokens</li><li>mean: 21.28 tokens</li><li>max: 33 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.24 tokens</li><li>max: 22 tokens</li></ul> | <ul><li>0: ~47.09%</li><li>1: ~52.91%</li></ul> |
 * Samples:
-  | cv                                                                                                       | job                                                                                  | label          |
-  |:---------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|:---------------|
-  | <code>Lập trình viên PHP, có kinh nghiệm phát triển các ứng dụng web sử dụng Laravel.</code>             | <code>Tuyển Supply Chain Manager, yêu cầu kinh nghiệm quản lý chuỗi cung ứng.</code> | <code>0</code> |
-  | <code>Tôi là nhà thiết kế thời trang, có kinh nghiệm trong thiết kế trang phục nữ.</code>                | <code>Cần tuyển kỹ sư cơ điện tử, yêu cầu kinh nghiệm lập trình PLC.</code>          | <code>0</code> |
-  | <code>Software Engineer, kinh nghiệm lập trình Python và Golang, đã triển khai hệ thống phân tán.</code> | <code>Tuyển Software Engineer có kinh nghiệm Python và Golang.</code>                | <code>1</code> |
 * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
   ```json
   {
@@ -301,20 +474,20 @@ You can finetune this model on your own dataset.
 #### triplet
-* Dataset: [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet) at [c8215b6](https://huggingface.co/datasets/HZeroxium/cv-job-triplet/tree/c8215b694523650ad1d37b0ee2d182978c42094d)
-* Size: 102 evaluation samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
-* Approximate statistics based on the first 102 samples:
-  |         | anchor                                                                             | positive                                                                           | negative                                                                          |
-  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
-  | type    | string                                                                             | string                                                                             | string                                                                            |
-  | details | <ul><li>min: 13 tokens</li><li>mean: 18.74 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 14.78 tokens</li><li>max: 20 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 13.42 tokens</li><li>max: 18 tokens</li></ul> |
 * Samples:
-  | anchor                                                                          | positive                                                                           | negative                                                              |
-  |:--------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------|
-  | <code>Graphic Designer, chuyên thiết kế UI/UX cho ứng dụng di động.</code>      | <code>UI/UX Designer cần kinh nghiệm trong thiết kế giao diện người dùng.</code>   | <code>Tuyển chuyên viên tài chính tư vấn đầu tư.</code>               |
-  | <code>Product Manager, 4 năm kinh nghiệm quản lý sản phẩm công nghệ.</code>     | <code>Tuyển Product Manager có kinh nghiệm phát triển sản phẩm công nghệ.</code>   | <code>Tuyển chuyên viên nhân sự quản lý đào tạo và tuyển dụng.</code> |
-  | <code>Chuyên viên quản lý tài chính, lập kế hoạch và theo dõi dòng tiền.</code> | <code>Finance Manager cần kinh nghiệm trong quản lý tài chính doanh nghiệp.</code> | <code>Tuyển chuyên viên phân tích dữ liệu y tế.</code>                |
 * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
   ```json
   {
@@ -323,6 +496,78 @@ You can finetune this model on your own dataset.
   }
   ```
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
@@ -457,48 +702,31 @@ You can finetune this model on your own dataset.
 </details>
 ### Training Logs
-| Epoch  | Step | Training Loss | binary loss | triplet loss | cosine_ap | cosine_accuracy |
-|:------:|:----:|:-------------:|:-----------:|:------------:|:---------:|:---------------:|
-| 0      | 0    | -             | -           | -            | 0.9849    | 1.0             |
-| 0.1282 | 10   | 1.391         | -           | -            | -         | -               |
-| 0.2564 | 20   | 0.5121        | -           | -            | -         | -               |
-| 0.3846 | 30   | 0.634         | -           | -            | -         | -               |
-| 0.5128 | 40   | 0.2135        | -           | -            | -         | -               |
-| 0.6410 | 50   | 0.0371        | -           | -            | -         | -               |
-| 0.7692 | 60   | 0.0413        | -           | -            | -         | -               |
-| 0.8974 | 70   | 0.0556        | -           | -            | -         | -               |
-| 1.0256 | 80   | 0.0051        | -           | -            | -         | -               |
-| 1.1538 | 90   | 0.0301        | -           | -            | -         | -               |
-| 1.2821 | 100  | 0.0104        | 0.0049      | 0.0252       | 0.9882    | 1.0             |
-| 1.4103 | 110  | 0.0168        | -           | -            | -         | -               |
-| 1.5385 | 120  | 0.012         | -           | -            | -         | -               |
-| 1.6667 | 130  | 0.0042        | -           | -            | -         | -               |
-| 1.7949 | 140  | 0.0071        | -           | -            | -         | -               |
-| 1.9231 | 150  | 0.007         | -           | -            | -         | -               |
-| 2.0513 | 160  | 0.0022        | -           | -            | -         | -               |
-| 2.1795 | 170  | 0.0043        | -           | -            | -         | -               |
-| 2.3077 | 180  | 0.0025        | -           | -            | -         | -               |
-| 2.4359 | 190  | 0.0038        | -           | -            | -         | -               |
-| 2.5641 | 200  | 0.006         | 0.0043      | 0.0142       | 0.9761    | 1.0             |
-| 2.6923 | 210  | 0.002         | -           | -            | -         | -               |
-| 2.8205 | 220  | 0.0043        | -           | -            | -         | -               |
-| 2.9487 | 230  | 0.003         | -           | -            | -         | -               |
-| 3.0769 | 240  | 0.0019        | -           | -            | -         | -               |
-| 3.2051 | 250  | 0.0024        | -           | -            | -         | -               |
-| 3.3333 | 260  | 0.002         | -           | -            | -         | -               |
-| 3.4615 | 270  | 0.0025        | -           | -            | -         | -               |
-| 3.5897 | 280  | 0.0022        | -           | -            | -         | -               |
-| 3.7179 | 290  | 0.0021        | -           | -            | -         | -               |
-| 3.8462 | 300  | 0.0017        | 0.0037      | 0.0162       | 0.9803    | 1.0             |
-| 3.9744 | 310  | 0.0023        | -           | -            | -         | -               |
-| 4.1026 | 320  | 0.0017        | -           | -            | -         | -               |
-| 4.2308 | 330  | 0.002         | -           | -            | -         | -               |
-| 4.3590 | 340  | 0.0022        | -           | -            | -         | -               |
-| 4.4872 | 350  | 0.0015        | -           | -            | -         | -               |
-| 4.6154 | 360  | 0.0018        | -           | -            | -         | -               |
-| 4.7436 | 370  | 0.0021        | -           | -            | -         | -               |
-| 4.8718 | 380  | 0.0014        | -           | -            | -         | -               |
-| 5.0    | 390  | 0.0022        | -           | -            | 0.9803    | 1.0             |
 ### Framework Versions
@@ -553,6 +781,29 @@ You can finetune this model on your own dataset.
 }
 ```
 <!--
 ## Glossary

 - sentence-similarity
 - feature-extraction
 - generated_from_trainer
+- dataset_size:22654
 - loss:ContrastiveLoss
 - loss:TripletLoss
+- loss:CoSENTLoss
+- loss:MultipleNegativesRankingLoss
 base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
 widget:
+- source_sentence: Network Operations Specialist yêu cầu tối ưu hóa mạng.
   sentences:
+  - Actor cần có kỹ năng biểu diễn sân khấu và hóa thân vào nhiều loại nhân vật.
+  - Network Operations Specialist cần tối ưu hóa mạng.
+  - Nhà tư vấn PR hỗ trợ doanh nghiệp trong việc phát triển hình ảnh công chúng và
+    xử lý khủng hoảng.
+- source_sentence: Cybersecurity Specialist với kinh nghiệm bảo mật hệ thống 5 năm.
   sentences:
+  - Kỹ sư cơ khí cần phát triển hệ thống sản xuất tự động hóa.
+  - Cybersecurity Engineer, yêu cầu tối thiểu 5 năm trong bảo mật.
+  - Data Scientist cần kỹ năng Machine Learning và Python.
+- source_sentence: Tư vấn môi trường hỗ trợ kiểm soát ô nhiễm môi trường đô thị.
   sentences:
+  - Quản lý chất thải có kinh nghiệm xử lý và tái chế nước.
+  - Tư vấn môi trường quản lý chất lượng môi trường đô thị.
+  - Illustrator cần có khả năng minh họa cho sách giáo dục và tài liệu học tập.
+- source_sentence: Mobile Developer với kinh nghiệm phát triển ứng dụng iOS và Swift.
   sentences:
+  - Tuyển iOS Developer có kỹ năng làm việc với Swift.
+  - Tuyển chuyên viên QA kiểm tra chất lượng phần mềm.
+  - Mobile Developer cần biết phát triển ứng dụng đa nền tảng.
+- source_sentence: Mobile Developer, kinh nghiệm lập trình ứng dụng iOS với Swift.
   sentences:
+  - Tuyển kỹ sư cơ khí giám sát dây chuyền sản xuất.
+  - Công ty XYZ tuyển Data Scientist với tối thiểu 2 năm kinh nghiệm học máy.
+  - Tuyển iOS Developer thành thạo Swift.
 datasets:
+- HZeroxium/job-cv-binary
 - HZeroxium/cv-job-triplet
+- HZeroxium/cv-job-similarity
+- HZeroxium/job-paraphrase
+- HZeroxium/cv-paraphrase
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
 - cosine_precision
 - cosine_recall
 - cosine_ap
+- pearson_cosine
+- spearman_cosine
 model-index:
 - name: SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
   results:
       type: unknown
     metrics:
     - type: cosine_accuracy
+      value: 0.9755351681957186
       name: Cosine Accuracy
     - type: cosine_accuracy_threshold
+      value: 0.5808850526809692
       name: Cosine Accuracy Threshold
     - type: cosine_f1
+      value: 0.9779005524861878
       name: Cosine F1
     - type: cosine_f1_threshold
+      value: 0.5644330978393555
       name: Cosine F1 Threshold
     - type: cosine_precision
+      value: 0.9833333333333333
       name: Cosine Precision
     - type: cosine_recall
+      value: 0.9725274725274725
       name: Cosine Recall
     - type: cosine_ap
+      value: 0.9956042554162885
+      name: Cosine Ap
+    - type: cosine_accuracy
+      value: 0.9968051118210862
+      name: Cosine Accuracy
+    - type: cosine_accuracy_threshold
+      value: 0.7650139331817627
+      name: Cosine Accuracy Threshold
+    - type: cosine_f1
+      value: 0.9984
+      name: Cosine F1
+    - type: cosine_f1_threshold
+      value: 0.7650139331817627
+      name: Cosine F1 Threshold
+    - type: cosine_precision
+      value: 1.0
+      name: Cosine Precision
+    - type: cosine_recall
+      value: 0.9968051118210862
+      name: Cosine Recall
+    - type: cosine_ap
+      value: 0.9999999999999999
+      name: Cosine Ap
+    - type: cosine_accuracy
+      value: 0.9936305732484076
+      name: Cosine Accuracy
+    - type: cosine_accuracy_threshold
+      value: 0.8211346864700317
+      name: Cosine Accuracy Threshold
+    - type: cosine_f1
+      value: 0.9968051118210862
+      name: Cosine F1
+    - type: cosine_f1_threshold
+      value: 0.8211346864700317
+      name: Cosine F1 Threshold
+    - type: cosine_precision
+      value: 1.0
+      name: Cosine Precision
+    - type: cosine_recall
+      value: 0.9936305732484076
+      name: Cosine Recall
+    - type: cosine_ap
+      value: 1.0
       name: Cosine Ap
   - task:
       type: triplet
     - type: cosine_accuracy
       value: 1.0
       name: Cosine Accuracy
+  - task:
+      type: semantic-similarity
+      name: Semantic Similarity
+    dataset:
+      name: Unknown
+      type: unknown
+    metrics:
+    - type: pearson_cosine
+      value: 0.970012297655986
+      name: Pearson Cosine
+    - type: spearman_cosine
+      value: 0.9430534588122865
+      name: Spearman Cosine
 ---
 # SentenceTransformer based on sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
+This is a [sentence-transformers](https://www.SBERT.net) model fine-tuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) on the [binary](https://huggingface.co/datasets/HZeroxium/job-cv-binary), [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet), [similarity](https://huggingface.co/datasets/HZeroxium/cv-job-similarity), [job_paraphrase](https://huggingface.co/datasets/HZeroxium/job-paraphrase), and [cv_paraphrase](https://huggingface.co/datasets/HZeroxium/cv-paraphrase) datasets. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 ## Model Details
 - **Output Dimensionality:** 384 dimensions
 - **Similarity Function:** Cosine Similarity
 - **Training Datasets:**
+    - [binary](https://huggingface.co/datasets/HZeroxium/job-cv-binary)
     - [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet)
+    - [similarity](https://huggingface.co/datasets/HZeroxium/cv-job-similarity)
+    - [job_paraphrase](https://huggingface.co/datasets/HZeroxium/job-paraphrase)
+    - [cv_paraphrase](https://huggingface.co/datasets/HZeroxium/cv-paraphrase)
 <!-- - **Language:** Unknown -->
 <!-- - **License:** Unknown -->
 from sentence_transformers import SentenceTransformer
 # Download from the 🤗 Hub
+model = SentenceTransformer("HZeroxium/paraphrase-multilingual-MiniLM-L12-v2-job-cv-multi-dataset")
 # Run inference
 sentences = [
+    'Mobile Developer, kinh nghiệm lập trình ứng dụng iOS với Swift.',
+    'Tuyển iOS Developer thành thạo Swift.',
+    'Tuyển kỹ sư cơ khí giám sát dây chuyền sản xuất.',
 ]
 embeddings = model.encode(sentences)
 print(embeddings.shape)
 | Metric                    | Value      |
 |:--------------------------|:-----------|
+| cosine_accuracy           | 0.9755     |
+| cosine_accuracy_threshold | 0.5809     |
+| cosine_f1                 | 0.9779     |
+| cosine_f1_threshold       | 0.5644     |
+| cosine_precision          | 0.9833     |
+| cosine_recall             | 0.9725     |
+| **cosine_ap**             | **0.9956** |
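
As a quick consistency check on the table above: F1 is the harmonic mean of precision and recall, and the two threshold rows are the cosine-similarity cutoffs at which accuracy and F1 peaked on the evaluation set. The sketch below recomputes F1 from the reported precision and recall and shows how such a cutoff could be applied to a similarity score. The `is_match` helper and the sample scores are illustrative assumptions, not part of the model card.

```python
# Values copied from the table above.
precision = 0.9833
recall = 0.9725
f1_threshold = 0.5644  # cosine-similarity cutoff reported for the best F1

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9779, matching the cosine_f1 row


def is_match(cosine_score: float, threshold: float = f1_threshold) -> bool:
    """Hypothetical helper: label a pair as a match at or above the cutoff."""
    return cosine_score >= threshold


# Made-up scores, for illustration only.
print(is_match(0.71), is_match(0.42))
```

Because the accuracy cutoff (0.5809) and the F1 cutoff (0.5644) differ slightly here, a downstream application would need to pick whichever matches the metric it cares about.
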
 #### Triplet
 |:--------------------|:--------|
 | **cosine_accuracy** | **1.0** |
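
Triplet accuracy can be read as the fraction of (anchor, positive, negative) triplets in which the anchor is closer to the positive than to the negative under cosine similarity. A minimal sketch of that computation follows, using tiny made-up 2-dimensional vectors rather than real model embeddings; the function names are assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_accuracy(triplets):
    """Share of triplets where sim(anchor, positive) > sim(anchor, negative)."""
    correct = sum(
        1 for anchor, pos, neg in triplets if cosine(anchor, pos) > cosine(anchor, neg)
    )
    return correct / len(triplets)


# Toy embeddings (assumed for illustration; the real ones have 384 dimensions).
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),  # positive is closer: counted correct
    ([0.0, 1.0], [1.0, 0.0], [0.1, 0.9]),  # negative is closer: counted wrong
]
print(triplet_accuracy(toy_triplets))  # 0.5
```
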
+#### Semantic Similarity
+* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)
+| Metric              | Value      |
+|:--------------------|:-----------|
+| pearson_cosine      | 0.97       |
+| **spearman_cosine** | **0.9431** |
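
The two metrics above compare the model's cosine similarities with gold similarity scores: Pearson measures linear agreement, while Spearman applies the same formula to ranks and so measures only agreement in ordering. Below is a small pure-Python sketch on made-up numbers. It is not the evaluator's own code, and the ranking helper assumes no tied values, which is a simplification.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Rank positions starting at 1; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman correlation: Pearson computed on ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up gold labels and predicted cosine similarities.
gold = [0.1, 0.4, 0.6, 0.9]
pred = [0.2, 0.3, 0.8, 0.7]
print(round(pearson(gold, pred), 4), round(spearman(gold, pred), 4))
```

In this toy data the last two predictions are swapped relative to the gold ordering, so Spearman drops below 1 even though the scores still correlate strongly.
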
+#### Binary Classification
+* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+| Metric                    | Value   |
+|:--------------------------|:--------|
+| cosine_accuracy           | 0.9968  |
+| cosine_accuracy_threshold | 0.765   |
+| cosine_f1                 | 0.9984  |
+| cosine_f1_threshold       | 0.765   |
+| cosine_precision          | 1.0     |
+| cosine_recall             | 0.9968  |
+| **cosine_ap**             | **1.0** |
+#### Binary Classification
+* Evaluated with [<code>BinaryClassificationEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.BinaryClassificationEvaluator)
+| Metric                    | Value   |
+|:--------------------------|:--------|
+| cosine_accuracy           | 0.9936  |
+| cosine_accuracy_threshold | 0.8211  |
+| cosine_f1                 | 0.9968  |
+| cosine_f1_threshold       | 0.8211  |
+| cosine_precision          | 1.0     |
+| cosine_recall             | 0.9936  |
+| **cosine_ap**             | **1.0** |
 <!--
 ## Bias, Risks and Limitations
 #### binary
+* Dataset: [binary](https://huggingface.co/datasets/HZeroxium/job-cv-binary) at [8c79343](https://huggingface.co/datasets/HZeroxium/job-cv-binary/tree/8c79343a3f789fc136bd857209d4b45c498f2ead)
+* Size: 6,197 training samples
+* Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
 * Approximate statistics based on the first 1000 samples:
+  |         | text1                                                                             | text2                                                                             | label                                           |
+  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
+  | type    | string                                                                            | string                                                                            | int                                             |
+  | details | <ul><li>min: 10 tokens</li><li>mean: 19.5 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.91 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>0: ~43.70%</li><li>1: ~56.30%</li></ul> |
 * Samples:
+  | text1                                                                                        | text2                                                                   | label          |
+  |:---------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:---------------|
+  | <code>Lập trình viên backend, 3 năm kinh nghiệm với Node.js và xây dựng API.</code>          | <code>Tuyển Backend Developer có kinh nghiệm với Node.js.</code>        | <code>1</code> |
+  | <code>Kỹ sư mạng với 6 năm kinh nghiệm quản lý hệ thống mạng lớn.</code>                     | <code>Cần System Administrator với kinh nghiệm quản lý hệ thống.</code> | <code>0</code> |
+  | <code>Lập trình viên JavaScript với 4 năm kinh nghiệm, thành thạo Node.js và Express.</code> | <code>Cần tuyển Backend Developer biết sử dụng PHP và Laravel.</code>   | <code>0</code> |
 * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
   ```json
   {
 #### triplet
+* Dataset: [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet) at [3100410](https://huggingface.co/datasets/HZeroxium/cv-job-triplet/tree/31004104be298c5f2f1648d8234391e7a5f7d9c0)
+* Size: 2,981 training samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | anchor                                                                             | positive                                                                           | negative                                                                          |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                             | string                                                                            |
+  | details | <ul><li>min: 10 tokens</li><li>mean: 19.51 tokens</li><li>max: 36 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 15.88 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 14.47 tokens</li><li>max: 22 tokens</li></ul> |
 * Samples:
+  | anchor                                                                                       | positive                                                                            | negative                                                                       |
+  |:---------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------|
+  | <code>Account Manager, chuyên quản lý khách hàng B2B và xây dựng mối quan hệ lâu dài.</code> | <code>Tuyển Account Manager có kinh nghiệm quản lý khách hàng doanh nghiệp.</code>  | <code>Tuyển chuyên viên pháp lý tư vấn doanh nghiệp.</code>                    |
+  | <code>Chuyên viên tư vấn giáo dục với 10 năm kinh nghiệm định hướng nghề nghiệp.</code>      | <code>Cần chuyên viên tư vấn giáo dục có kinh nghiệm định hướng nghề nghiệp.</code> | <code>Nhân viên tổ chức sự kiện giáo dục hỗ trợ triển khai hội thảo.</code>    |
+  | <code>Actor với nhiều năm kinh nghiệm diễn xuất trên sân khấu và phim truyền hình.</code>    | <code>Diễn viên cần có khả năng hóa thân vào các vai diễn phức tạp.</code>          | <code>Nhà sản xuất phim cần quản lý và tổ chức các dự án phim tài liệu.</code> |
 * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
   ```json
   {
   }
   ```
+#### similarity
+* Dataset: [similarity](https://huggingface.co/datasets/HZeroxium/cv-job-similarity) at [c810681](https://huggingface.co/datasets/HZeroxium/cv-job-similarity/tree/c8106811dc1709bb834a1b59e3cb46f5ab75dfd9)
+* Size: 4,568 training samples
+* Columns: <code>text1</code>, <code>text2</code>, and <code>score</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | text1                                                                              | text2                                                                             | score                                                            |
+  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                            | float                                                            |
+  | details | <ul><li>min: 10 tokens</li><li>mean: 18.86 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.12 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 0.19</li><li>mean: 0.68</li><li>max: 0.96</li></ul> |
+* Samples:
+  | text1                                                                                                | text2                                                                            | score             |
+  |:-----------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------|:------------------|
+  | <code>Hardware Engineer có khả năng thiết kế hệ thống nhúng.</code>                                  | <code>Embedded Engineer cần có kỹ năng phát triển phần mềm nhúng.</code>         | <code>0.74</code> |
+  | <code>Kỹ sư phần mềm, chuyên môn trong phát triển hệ thống thời gian thực, 4 năm kinh nghiệm.</code> | <code>Yêu cầu Embedded Software Engineer với kinh nghiệm tối thiểu 3 năm.</code> | <code>0.88</code> |
+  | <code>Cần Software Engineer với kinh nghiệm phát triển web.</code>                                   | <code>Frontend Developer cần thành thạo React và JavaScript.</code>              | <code>0.34</code> |
+* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "pairwise_cos_sim"
+  }
+  ```
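As a rough illustration of what the `scale` and `pairwise_cos_sim` parameters above control, here is a dependency-free sketch of a CoSENT-style objective. It is not the sentence-transformers implementation; the function names `cosine` and `cosent_loss` and the toy 2-d embeddings are assumptions made for this example only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosent_loss(emb1, emb2, scores, scale=20.0):
    """CoSENT-style loss: log(1 + sum exp(scale * (sim_j - sim_i))) over all
    pairs (i, j) whose gold scores satisfy scores[i] > scores[j]."""
    # Pairwise similarity: row k of emb1 is compared only with row k of emb2.
    sims = [cosine(u, v) for u, v in zip(emb1, emb2)]
    terms = [0.0]  # exp(0) accounts for the "1 +" inside the log
    for i in range(len(sims)):
        for j in range(len(sims)):
            if scores[i] > scores[j]:
                terms.append(scale * (sims[j] - sims[i]))
    # Numerically stable log-sum-exp.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


# Hypothetical toy embeddings: pair 0 is identical, pair 1 is orthogonal.
emb1 = [[1.0, 0.0], [1.0, 0.0]]
emb2 = [[1.0, 0.0], [0.0, 1.0]]

well_ordered = cosent_loss(emb1, emb2, scores=[0.9, 0.2])
mis_ordered = cosent_loss(emb1, emb2, scores=[0.2, 0.9])
print(well_ordered, mis_ordered)
```

The loss depends only on the ordering of the gold scores, not on their absolute values, so a larger `scale` penalizes mis-ranked pairs more sharply.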
+#### job_paraphrase
+* Dataset: [job_paraphrase](https://huggingface.co/datasets/HZeroxium/job-paraphrase) at [6872029](https://huggingface.co/datasets/HZeroxium/job-paraphrase/tree/68720291bb9f628792d2f28d4653f03f6de5ef42)
+* Size: 5,939 training samples
+* Columns: <code>text1</code> and <code>text2</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | text1                                                                             | text2                                                                             |
+  |:--------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | type    | string                                                                            | string                                                                            |
+  | details | <ul><li>min: 6 tokens</li><li>mean: 16.25 tokens</li><li>max: 25 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 15.78 tokens</li><li>max: 25 tokens</li></ul> |
+* Samples:
+  | text1                                                                         | text2                                                                             |
+  |:------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | <code>Nhân viên hỗ trợ kho thuốc cần kỹ năng quản lý.</code>                  | <code>Nhân viên kho thuốc cần kỹ năng kiểm kê.</code>                             |
+  | <code>Nhân viên bán hàng cần có kỹ năng giao tiếp và xử lý tình huống.</code> | <code>Salesperson chuyên xử lý đơn hàng và giữ mối quan hệ với khách hàng.</code> |
+  | <code>Tuyển kỹ sư cơ khí chuyên thiết kế máy móc công nghiệp.</code>          | <code>Kỹ sư cơ khí cần thiết kế hệ thống sản xuất tiên tiến.</code>               |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
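The `scale` and `cos_sim` parameters above can be read as follows: every `text2` in a batch is scored against every `text1`, and the other rows' `text2` entries act as in-batch negatives. The sketch below shows that computation in plain Python under those assumptions. It is not the library's code, and `cosine`, `mnr_loss`, and the toy vectors are hypothetical names and data used only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where, for anchor i, positives[i] is the target
    and every other positives[j] serves as an in-batch negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log of the softmax denominator.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_z - logits[i])
    return sum(losses) / len(losses)


# Hypothetical toy batch: each anchor matches its own positive exactly.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
# Same batch with the positives swapped, so each anchor prefers the wrong row.
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned, swapped)
```

Because negatives come from the rest of the batch, the effective difficulty of this objective grows with batch size.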
+#### cv_paraphrase
+* Dataset: [cv_paraphrase](https://huggingface.co/datasets/HZeroxium/cv-paraphrase) at [22ce02f](https://huggingface.co/datasets/HZeroxium/cv-paraphrase/tree/22ce02ff309bc91193b3fa9c14a51fb3481a5fc2)
+* Size: 2,969 training samples
+* Columns: <code>text1</code> and <code>text2</code>
+* Approximate statistics based on the first 1000 samples:
+  |         | text1                                                                             | text2                                                                              |
+  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                            | string                                                                             |
+  | details | <ul><li>min: 10 tokens</li><li>mean: 20.6 tokens</li><li>max: 38 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 19.52 tokens</li><li>max: 32 tokens</li></ul> |
+* Samples:
+  | text1                                                                                                             | text2                                                                                 |
+  |:------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|
+  | <code>Chuyên viên quản lý danh mục đầu tư với 8 năm kinh nghiệm tối ưu hóa tài sản và phân tích lợi nhuận.</code> | <code>8 năm kinh nghiệm quản lý danh mục đầu tư và phân tích tài chính.</code>        |
+  | <code>Hotel Manager with strong leadership skills and 5 years of experience.</code>                               | <code>Hotel manager skilled in optimizing hotel operations and guest services.</code> |
+  | <code>7 năm kinh nghiệm phát triển backend và cơ sở dữ liệu.</code>                                               | <code>Backend Developer chuyên về API và cơ sở dữ liệu.</code>                        |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
 ### Evaluation Datasets
 #### binary
+* Dataset: [binary](https://huggingface.co/datasets/HZeroxium/job-cv-binary) at [8c79343](https://huggingface.co/datasets/HZeroxium/job-cv-binary/tree/8c79343a3f789fc136bd857209d4b45c498f2ead)
+* Size: 327 evaluation samples
+* Columns: <code>text1</code>, <code>text2</code>, and <code>label</code>
+* Approximate statistics based on the first 327 samples:
+  |         | text1                                                                              | text2                                                                             | label                                           |
   |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:------------------------------------------------|
   | type    | string                                                                             | string                                                                            | int                                             |
+  | details | <ul><li>min: 11 tokens</li><li>mean: 19.36 tokens</li><li>max: 31 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 16.01 tokens</li><li>max: 26 tokens</li></ul> | <ul><li>0: ~44.34%</li><li>1: ~55.66%</li></ul> |
 * Samples:
+  | text1                                                                      | text2                                                                 | label          |
+  |:---------------------------------------------------------------------------|:----------------------------------------------------------------------|:---------------|
+  | <code>Tuyển kỹ sư phần mềm nhúng có kinh nghiệm 3 năm trở lên.</code>      | <code>Software Developer, yêu cầu hiểu biết về hệ thống nhúng.</code> | <code>0</code> |
+  | <code>Tư vấn môi trường hỗ trợ kiểm soát ô nhiễm môi trường đô thị.</code> | <code>Quản lý chất thải có kinh nghiệm xử lý và tái chế nước.</code>  | <code>1</code> |
+  | <code>DevOps Engineer với khả năng triển khai trên AWS, Azure.</code>      | <code>Cloud Engineer cần quản lý hạ tầng.</code>                      | <code>1</code> |
 * Loss: [<code>ContrastiveLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#contrastiveloss) with these parameters:
   ```json
   {
 #### triplet
+* Dataset: [triplet](https://huggingface.co/datasets/HZeroxium/cv-job-triplet) at [3100410](https://huggingface.co/datasets/HZeroxium/cv-job-triplet/tree/31004104be298c5f2f1648d8234391e7a5f7d9c0)
+* Size: 157 evaluation samples
 * Columns: <code>anchor</code>, <code>positive</code>, and <code>negative</code>
+* Approximate statistics based on the first 157 samples:
+  |         | anchor                                                                            | positive                                                                           | negative                                                                          |
+  |:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|
+  | type    | string                                                                            | string                                                                             | string                                                                            |
+  | details | <ul><li>min: 13 tokens</li><li>mean: 19.6 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 15.66 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 7 tokens</li><li>mean: 14.06 tokens</li><li>max: 20 tokens</li></ul> |
 * Samples:
+  | anchor                                                                                       | positive                                                                 | negative                                                        |
+  |:---------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------|:----------------------------------------------------------------|
+  | <code>Quản lý danh mục đầu tư tài chính trong hơn 6 năm, chuyên gia phân tích đầu tư.</code> | <code>Investment Analyst cần kinh nghiệm quản lý danh mục đầu tư.</code> | <code>Kế toán chi phí phụ trách kiểm soát chi phí.</code>       |
+  | <code>Chuyên viên quản lý chuỗi cung ứng, thành thạo SAP và tối ưu hóa quy trình.</code>     | <code>Supply Chain Manager có kinh nghiệm tối ưu chuỗi cung ứng.</code>  | <code>Tuyển lập trình viên Unity phát triển trò chơi 3D.</code> |
+  | <code>Nhà phân tích dữ liệu, kinh nghiệm trong lĩnh vực y tế và sinh học.</code>             | <code>Data Analyst cần kỹ năng phân tích dữ liệu y tế.</code>            | <code>Tuyển nhân viên kinh doanh bất động sản.</code>           |
 * Loss: [<code>TripletLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#tripletloss) with these parameters:
   ```json
   {
   }
   ```
+#### similarity
+* Dataset: [similarity](https://huggingface.co/datasets/HZeroxium/cv-job-similarity) at [c810681](https://huggingface.co/datasets/HZeroxium/cv-job-similarity/tree/c8106811dc1709bb834a1b59e3cb46f5ab75dfd9)
+* Size: 241 evaluation samples
+* Columns: <code>text1</code>, <code>text2</code>, and <code>score</code>
+* Approximate statistics based on the first 241 samples:
+  |         | text1                                                                              | text2                                                                             | score                                                           |
+  |:--------|:-----------------------------------------------------------------------------------|:----------------------------------------------------------------------------------|:----------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                            | float                                                           |
+  | details | <ul><li>min: 11 tokens</li><li>mean: 18.69 tokens</li><li>max: 28 tokens</li></ul> | <ul><li>min: 8 tokens</li><li>mean: 15.93 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 0.2</li><li>mean: 0.67</li><li>max: 0.95</li></ul> |
+* Samples:
+  | text1                                                                                    | text2                                                                                 | score             |
+  |:-----------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------|:------------------|
+  | <code>Cần Quản lý đội xe có khả năng giám sát hiệu suất và lập kế hoạch vận hành.</code> | <code>Điều phối viên vận tải yêu cầu giám sát và tối ưu hóa hoạt động vận tải.</code> | <code>0.83</code> |
+  | <code>Lập trình viên Python với kỹ năng xây dựng và tối ưu hóa hệ thống backend.</code>  | <code>Hỗ trợ kỹ thuật viên IT xử lý lỗi mạng.</code>                                  | <code>0.29</code> |
+  | <code>Nhà khoa học nghiên cứu các hệ thống nano tiên tiến cho y học hiện đại.</code>     | <code>Kỹ thuật viên thí nghiệm tập trung vào phân tích vật liệu nano.</code>          | <code>0.74</code> |
+* Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "pairwise_cos_sim"
+  }
+  ```
+#### job_paraphrase
+* Dataset: [job_paraphrase](https://huggingface.co/datasets/HZeroxium/job-paraphrase) at [6872029](https://huggingface.co/datasets/HZeroxium/job-paraphrase/tree/68720291bb9f628792d2f28d4653f03f6de5ef42)
+* Size: 313 evaluation samples
+* Columns: <code>text1</code> and <code>text2</code>
+* Approximate statistics based on the first 313 samples:
+  |         | text1                                                                              | text2                                                                              |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                             |
+  | details | <ul><li>min: 10 tokens</li><li>mean: 16.32 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 10 tokens</li><li>mean: 15.74 tokens</li><li>max: 25 tokens</li></ul> |
+* Samples:
+  | text1                                                                            | text2                                                                                |
+  |:---------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
+  | <code>Restaurant Manager chịu trách nhiệm giám sát và tối ưu hóa dịch vụ.</code> | <code>Restaurant Manager có khả năng điều hành và phát triển dịch vụ ăn uống.</code> |
+  | <code>Quản lý thương mại điện tử tối ưu hóa quy trình bán hàng.</code>           | <code>Quản lý sàn thương mại điện tử cần tối ưu hóa vận hành.</code>                 |
+  | <code>Kỹ thuật viên kiểm tra cần kiểm tra chất lượng hệ thống sản xuất.</code>   | <code>Kỹ thuật viên kiểm tra yêu cầu giám sát quy trình sản xuất.</code>             |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
+#### cv_paraphrase
+* Dataset: [cv_paraphrase](https://huggingface.co/datasets/HZeroxium/cv-paraphrase) at [22ce02f](https://huggingface.co/datasets/HZeroxium/cv-paraphrase/tree/22ce02ff309bc91193b3fa9c14a51fb3481a5fc2)
+* Size: 157 evaluation samples
+* Columns: <code>text1</code> and <code>text2</code>
+* Approximate statistics based on the first 157 samples:
+  |         | text1                                                                              | text2                                                                              |
+  |:--------|:-----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|
+  | type    | string                                                                             | string                                                                             |
+  | details | <ul><li>min: 12 tokens</li><li>mean: 20.28 tokens</li><li>max: 32 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 19.34 tokens</li><li>max: 28 tokens</li></ul> |
+* Samples:
+  | text1                                                                                                | text2                                                                                                      |
+  |:-----------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|
+  | <code>Producer với kinh nghiệm quản lý các dự án phim truyền hình và phim tài liệu.</code>           | <code>Chuyên gia sản xuất phim với kỹ năng quản lý các dự án phim lớn.</code>                              |
+  | <code>Chuyên viên xử lý môi trường có kinh nghiệm trong xử lý nước thải và kiểm soát ô nhiễm.</code> | <code>Chuyên gia tư vấn môi trường với kinh nghiệm phát triển các dự án tái chế và xử lý nước thải.</code> |
+  | <code>Cybersecurity Expert, chuyên gia bảo mật với 3 năm kinh nghiệm.</code>                         | <code>Chuyên gia An ninh mạng, 3 năm kinh nghiệm bảo mật hệ thống.</code>                                  |
+* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
+  ```json
+  {
+      "scale": 20.0,
+      "similarity_fct": "cos_sim"
+  }
+  ```
 ### Training Hyperparameters
 #### Non-Default Hyperparameters
 </details>
 ### Training Logs
+| Epoch  | Step | Training Loss | binary loss | triplet loss | similarity loss | job paraphrase loss | cv paraphrase loss | cosine_ap | cosine_accuracy | spearman_cosine |
+|:------:|:----:|:-------------:|:-----------:|:------------:|:---------------:|:-------------------:|:------------------:|:---------:|:---------------:|:---------------:|
+| 0      | 0    | -             | -           | -            | -               | -                   | -                  | 1.0       | 0.9682          | 0.5468          |
+| 0.2817 | 200  | 2.401         | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 0.5634 | 400  | 1.5659        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 0.7042 | 500  | -             | 0.0088      | 0.2391       | 6.9067          | 0.1746              | 0.2689             | 1.0       | 0.9936          | 0.9123          |
+| 0.8451 | 600  | 1.8501        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 1.1268 | 800  | 1.7318        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 1.4085 | 1000 | 1.3758        | 0.0079      | 0.0367       | 6.2019          | 0.1665              | 0.2657             | 1.0       | 1.0             | 0.9238          |
+| 1.6901 | 1200 | 1.3554        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 1.9718 | 1400 | 1.5119        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 2.1127 | 1500 | -             | 0.0081      | 0.0144       | 5.7135          | 0.1633              | 0.2295             | 1.0       | 1.0             | 0.9341          |
+| 2.2535 | 1600 | 1.2886        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 2.5352 | 1800 | 1.1131        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 2.8169 | 2000 | 1.3962        | 0.0108      | 0.0191       | 6.0231          | 0.1540              | 0.2342             | 1.0       | 1.0             | 0.9396          |
+| 3.0986 | 2200 | 1.2394        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 3.3803 | 2400 | 1.1392        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 3.5211 | 2500 | -             | 0.0097      | 0.0025       | 5.6361          | 0.1580              | 0.2212             | 1.0       | 1.0             | 0.9410          |
+| 3.6620 | 2600 | 1.1614        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 3.9437 | 2800 | 1.2351        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 4.2254 | 3000 | 1.1862        | 0.0100      | 0.0107       | 5.5943          | 0.1517              | 0.2158             | 1.0       | 1.0             | 0.9420          |
+| 4.5070 | 3200 | 0.9371        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 4.7887 | 3400 | 1.3572        | -           | -            | -               | -                   | -                  | -         | -               | -               |
+| 4.9296 | 3500 | -             | 0.0104      | 0.0057       | 5.6213          | 0.1539              | 0.2141             | 1.0       | 1.0             | 0.9429          |
+| 5.0    | 3550 | -             | -           | -            | -               | -                   | -                  | 1.0       | 1.0             | 0.9431          |
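The Epoch column in the table above is consistent with a fixed number of optimizer steps per epoch. The final row (step 3550 at epoch 5.0) implies 710 steps per epoch, and the short check below reproduces several logged epoch values from that figure. The variable and function names are illustrative only.

```python
# (epoch, step) pairs copied from the training log table above.
logged = [(0.2817, 200), (0.7042, 500), (0.8451, 600),
          (1.4085, 1000), (3.6620, 2600), (4.9296, 3500), (5.0, 3550)]

# Derived from the last row: 3550 steps completed at epoch 5.0.
steps_per_epoch = 3550 / 5.0


def epoch_at(step):
    """Fractional epoch reached after `step` optimizer steps."""
    return step / steps_per_epoch


for epoch, step in logged:
    print(step, round(epoch_at(step), 4), epoch)
```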
 ### Framework Versions
 }
 ```
+#### CoSENTLoss
+```bibtex
+@online{kexuefm-8847,
+    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+    author={Su Jianlin},
+    year={2022},
+    month={Jan},
+    url={https://kexue.fm/archives/8847},
+}
+```
+#### MultipleNegativesRankingLoss
+```bibtex
+@misc{henderson2017efficient,
+    title={Efficient Natural Language Response Suggestion for Smart Reply},
+    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
+    year={2017},
+    eprint={1705.00652},
+    archivePrefix={arXiv},
+    primaryClass={cs.CL}
+}
+```
 <!--
 ## Glossary

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:91264ff498242e97ca4fc6e8ecc2f4ff2a58184da00679f1d11ffb271f8478af
 size 470637416

 version https://git-lfs.github.com/spec/v1
+oid sha256:06de7179a076ef54737d05a716f4e621e3078a7b83a92970e3eaf55dab0ed0a4
 size 470637416

runs/Nov18_22-34-49_DESKTOP-T51O3H3/events.out.tfevents.1731944093.DESKTOP-T51O3H3.12064.0 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:917f900e27a1ccca362cf5f9d02606c0f793ff29960421ca5414fddc246f0340
+size 14276

runs/Nov18_22-37-55_DESKTOP-T51O3H3/events.out.tfevents.1731944278.DESKTOP-T51O3H3.22016.0 ADDED

@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:188f029ba96424339a3f086e26e9e3b147445a60ad049cdc0829e3c1461cd5af
+size 22673

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d11f9ab3a3250493b2dbb54720bf0584090d555d37c3fdb130ce4fefcaaea6f6
-size 5624

 version https://git-lfs.github.com/spec/v1
+oid sha256:e7411dec48308d116a10ef6fbd6f62c73bce2ff79de0fb9a3d0033f372d3c79c
+size 5688