Improved Model Performance in V7

This update improves performance on the SETTLEMENT_AGREEMENT, ENTERPRISE_AGREEMENT, DISTRIBUTION_PARTNER_AGREEMENT, and NDA_AGREEMENT labels, which previously performed poorly because their training prompts were not diverse enough. New prompts and datasets were created for these labels, and the model was retrained. On the test set, the retrained model reaches an accuracy of 0.8403, a precision of 0.8791, a recall of 0.8403, and an F1-score (the harmonic mean of precision and recall) of 0.8510. The evaluation loss, which measures the discrepancy between the model's predictions and the actual values, is 0.5713; lower values indicate better performance. Evaluation took 818.0557 seconds in total, at approximately 109.336 samples per second and 0.854 evaluation steps per second.
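
As a quick sanity check on these figures, the sketch below recomputes a per-label F1 from the precision and recall listed for NDA_AGREEMENT in the updated README table, and estimates the size of the evaluation run from the reported throughput and runtime. The helper name `f1_score` is illustrative and not part of the repository. One caveat: the aggregate F1 of 0.8510 need not equal the harmonic mean of the aggregate precision and recall if those are support-weighted averages over labels, which the identical accuracy and recall values suggest but the commit does not state.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Per-label values for NDA_AGREEMENT from the updated README metrics table.
nda_f1 = f1_score(0.60, 0.89)
print(f"NDA_AGREEMENT F1: {nda_f1:.2f}")

# Throughput and runtime as reported in the commit description.
samples_per_second = 109.336
steps_per_second = 0.854
runtime_seconds = 818.0557

# Rate multiplied by duration gives the approximate totals for the run.
estimated_samples = samples_per_second * runtime_seconds
estimated_steps = steps_per_second * runtime_seconds
print(f"Estimated evaluation samples: {estimated_samples:.0f}")
print(f"Estimated evaluation steps: {estimated_steps:.0f}")
```

The recomputed NDA_AGREEMENT value rounds to the 0.72 shown in the table, and the sample estimate comes to roughly 89,443, which equals the sum of the per-label support values in the updated metrics table.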

Files changed (4)

README.md +34 -27
config.json +43 -47
label encoder.joblib → label_encoder.joblib +2 -2
pytorch_model.bin +2 -2

README.md CHANGED

@@ -66,7 +66,7 @@ predicted_label = torch.argmax(probabilities, dim=-1)
 REPO_NAME = "daxa-ai/pebblo-classifier"
 # Path to the label encoder file in the repository
-LABEL_ENCODER_FILE = "label encoder.joblib"
 # Construct the URL to the label encoder file
 url = hf_hub_url(REPO_NAME, filename=LABEL_ENCODER_FILE)
@@ -96,9 +96,9 @@ Here are the labels along with their respective counts in the dataset:
 | BOARD_MEETING_AGREEMENT                 | 4,225     |
 | CONSULTING_AGREEMENT                    | 2,965     |
 | CUSTOMER_LIST_AGREEMENT                 | 9,000     |
-| DISTRIBUTION_PARTNER_AGREEMENT          | 8,339     |
 | EMPLOYEE_AGREEMENT                      | 3,921     |
-| ENTERPRISE_AGREEMENT                    | 3,820     |
 | ENTERPRISE_LICENSE_AGREEMENT            | 9,000     |
 | EXECUTIVE_SEVERANCE_AGREEMENT           | 9,000     |
 | FINANCIAL_REPORT_AGREEMENT              | 8,381     |
@@ -107,11 +107,11 @@ Here are the labels along with their respective counts in the dataset:
 | LOAN_AND_SECURITY_AGREEMENT             | 9,000     |
 | MEDICAL_ADVICE                          | 2,359     |
 | MERGER_AGREEMENT                        | 7,706     |
-| NDA_AGREEMENT                           | 2,966     |
-| NORMAL_TEXT                             | 6,742     |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT   | 9,000     |
 | PRICE_LIST_AGREEMENT                    | 9,000     |
-| SETTLEMENT_AGREEMENT                    | 9,000     |
 | SEXUAL_HARRASSMENT                      | 8,321     |
@@ -141,7 +141,7 @@ Here are the labels along with their respective counts in the dataset:
 | MEDICAL_ADVICE                          | 289       |
 | MERGER_AGREEMENT                        | 7,079     |
 | NDA_AGREEMENT                           | 1,452     |
-| NORMAL_TEXT                             | 1,808     |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT   | 6,177     |
 | PRICE_LIST_AGREEMENT                    | 5,453     |
 | SETTLEMENT_AGREEMENT                    | 5,806     |
@@ -151,28 +151,31 @@ Here are the labels along with their respective counts in the dataset:
 #### Metrics
 | Agreement Type                              | precision | recall | f1-score | support |
 | ------------------------------------------- | --------- | ------ | -------- | ------- |
-| BOARD_MEETING_AGREEMENT                     | 0.93      | 0.95   | 0.94     | 4335    |
-| CONSULTING_AGREEMENT                        | 0.72      | 0.98   | 0.84     | 1593    |
-| CUSTOMER_LIST_AGREEMENT                     | 0.64      | 0.82   | 0.72     | 4335    |
-| DISTRIBUTION_PARTNER_AGREEMENT              | 0.83      | 0.47   | 0.61     | 7231    |
-| EMPLOYEE_AGREEMENT                          | 0.78      | 0.92   | 0.85     | 1333    |
-| ENTERPRISE_AGREEMENT                        | 0.29      | 0.40   | 0.34     | 1616    |
-| ENTERPRISE_LICENSE_AGREEMENT                | 0.88      | 0.79   | 0.83     | 5574    |
-| EXECUTIVE_SERVICE_AGREEMENT                 | 0.92      | 0.85   | 0.89     | 8177    |
-| FINANCIAL_REPORT_AGREEMENT                  | 0.89      | 0.98   | 0.93     | 4264    |
-| HARMFUL_ADVICE                              | 0.79      | 0.95   | 0.86     | 474     |
-| INTERNAL_PRODUCT_ROADMAP_AGREEMENT          | 0.91      | 0.98   | 0.94     | 4116    |
-| LOAN_AND_SECURITY_AGREEMENT                 | 0.77      | 0.98   | 0.86     | 6354    |
-| MEDICAL_ADVICE                              | 0.81      | 0.99   | 0.89     | 289     |
-| MERGER_AGREEMENT                            | 0.89      | 0.77   | 0.83     | 7279    |
-| NDA_AGREEMENT                               | 0.70      | 0.57   | 0.62     | 1452    |
-| NORMAL_TEXT                                 | 0.79      | 0.97   | 0.87     | 1888    |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT       | 0.95      | 0.99   | 0.97     | 6177    |
-| PRICE_LIST_AGREEMENT                        | 0.60      | 0.75   | 0.67     | 5565    |
-| SETTLEMENT_AGREEMENT                        | 0.82      | 0.54   | 0.65     | 5843    |
-| SEXUAL_HARASSMENT                           | 0.97      | 0.94   | 0.95     | 440     |
 |                                             |           |        |          |         |
 | accuracy                                    |           |        | 0.79     | 82916   |
 | macro avg                                   | 0.79      | 0.83   | 0.80     | 82916   |
@@ -181,5 +184,9 @@ Here are the labels along with their respective counts in the dataset:
 #### Results
-The model's performance is summarized by precision, recall, and f1-score metrics, which are detailed across all 20 labels in the dataset. The accuracy stands at 0.79 for the entire test set, with a macro average and weighted average of precision, recall, and f1-score around 0.80 and 0.81, respectively.

 REPO_NAME = "daxa-ai/pebblo-classifier"
 # Path to the label encoder file in the repository
+LABEL_ENCODER_FILE = "label_encoder.joblib"
 # Construct the URL to the label encoder file
 url = hf_hub_url(REPO_NAME, filename=LABEL_ENCODER_FILE)
 | BOARD_MEETING_AGREEMENT                 | 4,225     |
 | CONSULTING_AGREEMENT                    | 2,965     |
 | CUSTOMER_LIST_AGREEMENT                 | 9,000     |
+| DISTRIBUTION_PARTNER_AGREEMENT          | 5,162     |
 | EMPLOYEE_AGREEMENT                      | 3,921     |
+| ENTERPRISE_AGREEMENT                    | 4,217     |
 | ENTERPRISE_LICENSE_AGREEMENT            | 9,000     |
 | EXECUTIVE_SEVERANCE_AGREEMENT           | 9,000     |
 | FINANCIAL_REPORT_AGREEMENT              | 8,381     |
 | LOAN_AND_SECURITY_AGREEMENT             | 9,000     |
 | MEDICAL_ADVICE                          | 2,359     |
 | MERGER_AGREEMENT                        | 7,706     |
+| NDA_AGREEMENT                           | 5,229     |
+| NORMAL_TEXT                             | 9,547     |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT   | 9,000     |
 | PRICE_LIST_AGREEMENT                    | 9,000     |
+| SETTLEMENT_AGREEMENT                    | 3,754     |
 | SEXUAL_HARRASSMENT                      | 8,321     |
 | MEDICAL_ADVICE                          | 289       |
 | MERGER_AGREEMENT                        | 7,079     |
 | NDA_AGREEMENT                           | 1,452     |
+| NORMAL_TEXT                             | 8,335     |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT   | 6,177     |
 | PRICE_LIST_AGREEMENT                    | 5,453     |
 | SETTLEMENT_AGREEMENT                    | 5,806     |
 #### Metrics
+Sure, here is the updated table in the exact format you provided:
 | Agreement Type                              | precision | recall | f1-score | support |
 | ------------------------------------------- | --------- | ------ | -------- | ------- |
+| BOARD_MEETING_AGREEMENT                     | 0.96      | 0.95   | 0.96     | 4335    |
+| CONSULTING_AGREEMENT                        | 0.77      | 0.89   | 0.82     | 1533    |
+| CUSTOMER_LIST_AGREEMENT                     | 0.85      | 0.87   | 0.86     | 4995    |
+| DISTRIBUTION_PARTNER_AGREEMENT              | 0.71      | 0.63   | 0.67     | 7231    |
+| EMPLOYEE_AGREEMENT                          | 0.77      | 0.89   | 0.83     | 1433    |
+| ENTERPRISE_AGREEMENT                        | 0.19      | 0.72   | 0.29     | 1616    |
+| ENTERPRISE_LICENSE_AGREEMENT                | 0.91      | 0.79   | 0.84     | 8574    |
+| EXECUTIVE_SEVERANCE_AGREEMENT               | 0.94      | 0.86   | 0.90     | 5177    |
+| FINANCIAL_REPORT_AGREEMENT                  | 0.93      | 0.98   | 0.95     | 4264    |
+| HARMFUL_ADVICE                              | 0.78      | 0.93   | 0.85     | 474     |
+| INTERNAL_PRODUCT_ROADMAP_AGREEMENT          | 0.94      | 0.97   | 0.96     | 4116    |
+| LOAN_AND_SECURITY_AGREEMENT                 | 0.93      | 0.96   | 0.94     | 6354    |
+| MEDICAL_ADVICE                              | 0.83      | 0.99   | 0.90     | 289     |
+| MERGER_AGREEMENT                            | 0.92      | 0.55   | 0.69     | 7079    |
+| NDA_AGREEMENT                               | 0.60      | 0.89   | 0.72     | 1452    |
+| NORMAL_TEXT                                 | 0.96      | 0.98   | 0.97     | 8335    |
 | PATENT_APPLICATION_FILLINGS_AGREEMENT       | 0.95      | 0.99   | 0.97     | 6177    |
+| PRICE_LIST_AGREEMENT                        | 0.84      | 0.73   | 0.78     | 5453    |
+| SETTLEMENT_AGREEMENT                        | 0.85      | 0.71   | 0.78     | 5806    |
+| SEXUAL_HARRASSMENT                          | 0.98      | 0.94   | 0.96     | 4750    |
 |                                             |           |        |          |         |
 | accuracy                                    |           |        | 0.79     | 82916   |
 | macro avg                                   | 0.79      | 0.83   | 0.80     | 82916   |
 #### Results
+The model’s performance is summarized by precision, recall, and f1-score metrics, which are detailed across all 20 labels in the dataset. Based on the test data evaluation results, the model achieved an accuracy of 0.8403, a precision of 0.8791, and a recall of 0.8403. The F1-score, which is the harmonic mean of precision and recall, stands at 0.8510.
+The evaluation loss, which measures the discrepancy between the model’s predictions and the actual values, is 0.5713. Lower loss values indicate better model performance.
+The model was able to process approximately 109.336 samples per second during the evaluation, which took a total runtime of 818.0557 seconds. The model performed approximately 0.854 evaluation steps per second.

config.json CHANGED

@@ -9,54 +9,51 @@
   "dropout": 0.1,
   "hidden_dim": 3072,
   "id2label": {
-  "0": "BOARD_MEETING_AGREEMENT",
-  "1": "CONSULTING_AGREEMENT",
-  "2": "CUSTOMER_LIST_AGREEMENT",
-  "3": "DISTRIBUTION_PARTNER_AGREEMENT",
-  "4": "ENTERPRISE_LICENSE_AGREEMENT",
-  "5": "EXECUTIVE_SEVERANCE_AGREEMENT",
-  "6": "FINANCIAL_REPORT_AGREEMENT",
-  "7": "HARMFUL_ADVICE",
-  "8": "INTERNAL_USE_ONLY_AGREEMENT",
-  "9": "LOAN_AND_SECURITY_AGREEMENT",
-  "10": "MEDICAL_ADVICE",
-  "11": "MERGER_AGREEMENT",
-  "12": "NDA_AGREEMENT",
-  "13": "NORMAL_TEXT",
-  "14": "PATENT_APPLICATION_FILLINGS_AGREEMENT",
-  "15": "PRICE_LIST_AGREEMENT",
-  "16": "SECRET_SAUCE_AGREEMENT",
-  "17": "SECURITY_BREACH_AGREEMENT",
-  "18": "SETTLEMENT_AGREEMENT",
-  "19": "SEXUAL_HARRASSMENT_AGREEMENT",
-  "20": "EMPLOYEE_AGREEMENT",
-  "21": "ENTERPRISE_AGREEMENT"
-},
   "initializer_range": 0.02,
   "label2id": {
-  "BOARD_MEETING_AGREEMENT": 0,
-  "CONSULTING_AGREEMENT": 1,
-  "MEDICAL_ADVICE": 10,
-  "MERGER_AGREEMENT": 11,
-  "NDA_AGREEMENT": 12,
-  "NORMAL_TEXT": 13,
-  "PATENT_APPLICATION_FILLINGS_AGREEMENT": 14,
-  "PRICE_LIST_AGREEMENT": 15,
-  "SECRET_SAUCE_AGREEMENT": 16,
-  "SECURITY_BREACH_AGREEMENT": 17,
-  "SETTLEMENT_AGREEMENT": 18,
-  "SEXUAL_HARRASSMENT_AGREEMENT": 19,
-  "CUSTOMER_LIST_AGREEMENT": 2,
-  "EMPLOYEE_AGREEMENT": 20,
-  "ENTERPRISE_AGREEMENT": 21,
-  "DISTRIBUTION_PARTNER_AGREEMENT": 3,
-  "ENTERPRISE_LICENSE_AGREEMENT": 4,
-  "EXECUTIVE_SEVERANCE_AGREEMENT": 5,
-  "FINANCIAL_REPORT_AGREEMENT": 6,
-  "HARMFUL_ADVICE": 7,
-  "INTERNAL_USE_ONLY_AGREEMENT": 8,
-  "LOAN_AND_SECURITY_AGREEMENT": 9
-},
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,
@@ -70,4 +67,3 @@
   "transformers_version": "4.36.2",
   "vocab_size": 30522
 }

   "dropout": 0.1,
   "hidden_dim": 3072,
   "id2label": {
+    "0": "BOARD_MEETING_AGREEMENT",
+    "1": "CONSULTING_AGREEMENT",
+    "2": "CUSTOMER_LIST_AGREEMENT",
+    "3": "DISTRIBUTION_PARTNER_AGREEMENT",
+    "4": "EMPLOYEE_AGREEMENT",
+    "5": "ENTERPRISE_AGREEMENT",
+    "6": "ENTERPRISE_LICENSE_AGREEMENT",
+    "7": "EXECUTIVE_SEVERANCE_AGREEMENT",
+    "8": "FINANCIAL_REPORT_AGREEMENT",
+    "9": "HARMFUL_ADVICE",
+    "10": "INTERNAL_PRODUCT_ROADMAP_AGREEMENT",
+    "11": "LOAN_AND_SECURITY_AGREEMENT",
+    "12": "MEDICAL_ADVICE",
+    "13": "MERGER_AGREEMENT",
+    "14": "NDA_AGREEMENT",
+    "15": "NORMAL_TEXT",
+    "16": "PATENT_APPLICATION_FILLINGS_AGREEMENT",
+    "17": "PRICE_LIST_AGREEMENT",
+    "18": "SETTLEMENT_AGREEMENT",
+    "19": "SEXUAL_HARRASSMENT"
+  },
   "initializer_range": 0.02,
   "label2id": {
+    "BOARD_MEETING_AGREEMENT": 0,
+    "CONSULTING_AGREEMENT": 1,
+    "INTERNAL_PRODUCT_ROADMAP_AGREEMENT": 10,
+    "LOAN_AND_SECURITY_AGREEMENT": 11,
+    "MEDICAL_ADVICE": 12,
+    "MERGER_AGREEMENT": 13,
+    "NDA_AGREEMENT": 14,
+    "NORMAL_TEXT": 15,
+    "PATENT_APPLICATION_FILLINGS_AGREEMENT": 16,
+    "PRICE_LIST_AGREEMENT": 17,
+    "SETTLEMENT_AGREEMENT": 18,
+    "SEXUAL_HARRASSMENT": 19,
+    "CUSTOMER_LIST_AGREEMENT": 2,
+    "DISTRIBUTION_PARTNER_AGREEMENT": 3,
+    "EMPLOYEE_AGREEMENT": 4,
+    "ENTERPRISE_AGREEMENT": 5,
+    "ENTERPRISE_LICENSE_AGREEMENT": 6,
+    "EXECUTIVE_SEVERANCE_AGREEMENT": 7,
+    "FINANCIAL_REPORT_AGREEMENT": 8,
+    "HARMFUL_ADVICE": 9
+  },
   "max_position_embeddings": 512,
   "model_type": "distilbert",
   "n_heads": 12,
   "transformers_version": "4.36.2",
   "vocab_size": 30522
 }
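
The updated `id2label` block lists the 20 labels in alphabetical order with ids 0 to 19, and `label2id` is its inverse. The sketch below reproduces that relationship under the assumption that the ids come from sorting the label names. That is how scikit-learn's `LabelEncoder` orders its classes, but the commit does not say how the mapping was actually generated.

```python
# The 20 label names from the updated config.json, deliberately unsorted here.
labels = [
    "NORMAL_TEXT", "MEDICAL_ADVICE", "HARMFUL_ADVICE", "SEXUAL_HARRASSMENT",
    "NDA_AGREEMENT", "MERGER_AGREEMENT", "SETTLEMENT_AGREEMENT",
    "PRICE_LIST_AGREEMENT", "EMPLOYEE_AGREEMENT", "ENTERPRISE_AGREEMENT",
    "CONSULTING_AGREEMENT", "BOARD_MEETING_AGREEMENT",
    "CUSTOMER_LIST_AGREEMENT", "FINANCIAL_REPORT_AGREEMENT",
    "ENTERPRISE_LICENSE_AGREEMENT", "LOAN_AND_SECURITY_AGREEMENT",
    "EXECUTIVE_SEVERANCE_AGREEMENT", "DISTRIBUTION_PARTNER_AGREEMENT",
    "INTERNAL_PRODUCT_ROADMAP_AGREEMENT",
    "PATENT_APPLICATION_FILLINGS_AGREEMENT",
]

# Assumption: ids are assigned by alphabetical position of the label name.
# Keys are strings in id2label, as they are in the JSON config.
id2label = {str(i): name for i, name in enumerate(sorted(labels))}

# label2id is the inverse mapping, with integer ids.
label2id = {name: int(i) for i, name in id2label.items()}

# Round-trip check: every label maps to an id that maps back to the same label.
assert all(id2label[str(idx)] == name for name, idx in label2id.items())

print(id2label["5"], label2id["SEXUAL_HARRASSMENT"])
```

Because the ids were renumbered in this commit (for example, EMPLOYEE_AGREEMENT moved from 20 to 4), integer predictions saved from the previous checkpoint should not be decoded with the new mapping.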

label encoder.joblib → label_encoder.joblib RENAMED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:aed62c86044052d34301175575b4ec585383ad9cf7c1177a8372e0161c1f8fb4
-size 1057

 version https://git-lfs.github.com/spec/v1
+oid sha256:9bde81f5904536b555a2de4f751e37347344a3de7b7b065cec5b03ca9f25b0ba
+size 1093

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4ce2513da6968cd9aa84adb3a032d6e3cbdd00db82da188bc31a155b32485eb1
-size 268216125

 version https://git-lfs.github.com/spec/v1
+oid sha256:a9022df425d46b36117b709b236dd6dc8983a19746e7e6219d039b8a51823a81
+size 268209725