sridhar-cd Tihsrah-CD committed on
Commit
3afb3ae
1 Parent(s): 2f3acfa

Update README with model card details (#5)


- Updated README.md with Model Card (ebbece91a1dee4a2fbf8b5e4d1f3a9eee02c44ac)


Co-authored-by: Harshit <Tihsrah-CD@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +182 -0
README.md CHANGED
@@ -1,3 +1,185 @@
---
license: mit
language:
- en
---

# Model Card for Pebblo Classifier

This model card describes the Pebblo Classifier, a machine-learning model specialized in text classification. Developed by DAXA.AI, it categorizes agreement documents commonly found in organizational settings and is trained on 20 distinct labels.

## Model Details

### Model Description

The Pebblo Classifier is a BERT-based model, fine-tuned from distilbert-base-uncased, targeting RAG (Retrieval-Augmented Generation) applications. It classifies text into categories such as "BOARD_MEETING_AGREEMENT," "CONSULTING_AGREEMENT," and others, streamlining document classification workflows.

- **Developed by:** DAXA.AI
- **Funded by:** Open Source
- **Model type:** Classification model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** distilbert-base-uncased
23
+
24
+ ### Model Sources
25
+
26
+ - **Repository:** [https://huggingface.co/daxa-ai/pebblo-classifier](https://huggingface.co/daxa-ai/pebblo-classifier?text=I+like+you.+I+love+you)
27
+ - **Demo:** [https://huggingface.co/spaces/daxa-ai/Daxa-Classifier](https://huggingface.co/spaces/daxa-ai/Daxa-Classifier)

## Uses

### Intended Use

The model is designed for direct application in document classification, capable of immediate deployment without additional fine-tuning.
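
If you only need quick predictions, the `transformers` text-classification pipeline is the shortest path. The snippet below is a minimal sketch, not an official example; it assumes the hosted config maps class indices to readable label names, otherwise the pipeline returns generic `LABEL_<i>` names that you can decode with the label encoder shown later in this card.

```python
# Minimal sketch: quick predictions with the transformers pipeline.
# Assumes the hosted config exposes readable id2label names; if it only
# provides generic LABEL_<i> entries, decode them with the label encoder
# described in "How to Get Started with the Model" below.
from transformers import pipeline

classifier = pipeline("text-classification", model="daxa-ai/pebblo-classifier")

documents = [
    "This consulting agreement is entered into by and between...",
    "Minutes of the quarterly board meeting held on...",
]
for result in classifier(documents, truncation=True):
    print(result["label"], round(result["score"], 3))
```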

### Recommendations

End users should be aware of the biases and limitations inherent in the model and take them into account before relying on its predictions.
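
One practical mitigation, not prescribed by the model authors, is to act only on confident predictions and route low-confidence documents to manual review. A hedged sketch follows; the 0.8 threshold is purely illustrative and should be tuned on your own validation data.

```python
# Hedged sketch: defer low-confidence predictions to manual review.
# The 0.8 threshold is an arbitrary illustrative value, not a figure
# published by the model authors.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daxa-ai/pebblo-classifier")
model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/pebblo-classifier")
model.eval()

def classify_with_threshold(text: str, threshold: float = 0.8):
    """Return (label_id, confidence), or None if the model is not confident."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probabilities = torch.softmax(logits, dim=-1)
    confidence, label_id = probabilities.max(dim=-1)
    if confidence.item() < threshold:
        return None  # defer to a human reviewer
    return int(label_id), float(confidence)

print(classify_with_threshold("Please enter your text here."))
```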

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# Import necessary libraries
import joblib
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("daxa-ai/pebblo-classifier")
model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/pebblo-classifier")

# Example text
text = "Please enter your text here."
encoded_input = tokenizer(text, return_tensors="pt", truncation=True)

# Run inference without tracking gradients
with torch.no_grad():
    output = model(**encoded_input)

# Apply softmax to the logits to get class probabilities
probabilities = torch.nn.functional.softmax(output.logits, dim=-1)

# Get the index of the predicted label
predicted_label = torch.argmax(probabilities, dim=-1)

# Hugging Face model repository and the label encoder file stored in it
REPO_NAME = "daxa-ai/pebblo-classifier"
LABEL_ENCODER_FILE = "label encoder.joblib"

# Download and cache the label encoder file
# (hf_hub_download replaces the deprecated hf_hub_url/cached_download pair)
filename = hf_hub_download(repo_id=REPO_NAME, filename=LABEL_ENCODER_FILE)

# Load the label encoder and decode the predicted index to a label name
label_encoder = joblib.load(filename)
decoded_label = label_encoder.inverse_transform(predicted_label.numpy())

print(decoded_label)
```

## Training Details

### Training Data

The training dataset consists of 131,771 entries spanning 20 unique labels. The labels cover various document types, with instances distributed across three target text lengths (128 ± x, 256 ± x, and 512 ± x words, where x varies within 20).
The labels and their respective counts in the dataset are:

| Agreement Type                        | Instances |
| ------------------------------------- | --------- |
| BOARD_MEETING_AGREEMENT               | 4,225     |
| CONSULTING_AGREEMENT                  | 2,965     |
| CUSTOMER_LIST_AGREEMENT               | 9,000     |
| DISTRIBUTION_PARTNER_AGREEMENT        | 8,339     |
| EMPLOYEE_AGREEMENT                    | 3,921     |
| ENTERPRISE_AGREEMENT                  | 3,820     |
| ENTERPRISE_LICENSE_AGREEMENT          | 9,000     |
| EXECUTIVE_SEVERANCE_AGREEMENT         | 9,000     |
| FINANCIAL_REPORT_AGREEMENT            | 8,381     |
| HARMFUL_ADVICE                        | 2,025     |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT    | 7,037     |
| LOAN_AND_SECURITY_AGREEMENT           | 9,000     |
| MEDICAL_ADVICE                        | 2,359     |
| MERGER_AGREEMENT                      | 7,706     |
| NDA_AGREEMENT                         | 2,966     |
| NORMAL_TEXT                           | 6,742     |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 9,000     |
| PRICE_LIST_AGREEMENT                  | 9,000     |
| SETTLEMENT_AGREEMENT                  | 9,000     |
| SEXUAL_HARRASSMENT                    | 8,321     |
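
The classes are noticeably imbalanced (for example, HARMFUL_ADVICE and MEDICAL_ADVICE have far fewer examples than the 9,000-instance classes). Below is a small sketch for inspecting the distribution; the dictionary is transcribed by hand from the table above and is not an official artifact of the repository.

```python
# Sketch: per-class share of the training data, transcribed from the table above.
training_counts = {
    "BOARD_MEETING_AGREEMENT": 4225,
    "CONSULTING_AGREEMENT": 2965,
    "CUSTOMER_LIST_AGREEMENT": 9000,
    "DISTRIBUTION_PARTNER_AGREEMENT": 8339,
    "EMPLOYEE_AGREEMENT": 3921,
    "ENTERPRISE_AGREEMENT": 3820,
    "ENTERPRISE_LICENSE_AGREEMENT": 9000,
    "EXECUTIVE_SEVERANCE_AGREEMENT": 9000,
    "FINANCIAL_REPORT_AGREEMENT": 8381,
    "HARMFUL_ADVICE": 2025,
    "INTERNAL_PRODUCT_ROADMAP_AGREEMENT": 7037,
    "LOAN_AND_SECURITY_AGREEMENT": 9000,
    "MEDICAL_ADVICE": 2359,
    "MERGER_AGREEMENT": 7706,
    "NDA_AGREEMENT": 2966,
    "NORMAL_TEXT": 6742,
    "PATENT_APPLICATION_FILLINGS_AGREEMENT": 9000,
    "PRICE_LIST_AGREEMENT": 9000,
    "SETTLEMENT_AGREEMENT": 9000,
    "SEXUAL_HARRASSMENT": 8321,
}

# Print each label's share of the training set, smallest classes first.
total = sum(training_counts.values())
for label, count in sorted(training_counts.items(), key=lambda kv: kv[1]):
    print(f"{label:<40} {count:>6} ({count / total:.1%})")
```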

## Evaluation

### Testing Data & Metrics

#### Testing Data

Evaluation was performed on a dataset of 82,917 entries, generated with a sampling temperature between 1 and 1.25 to introduce randomness.
The labels and their respective counts in the dataset are:

| Agreement Type                        | Instances |
| ------------------------------------- | --------- |
| BOARD_MEETING_AGREEMENT               | 4,335     |
| CONSULTING_AGREEMENT                  | 1,533     |
| CUSTOMER_LIST_AGREEMENT               | 4,995     |
| DISTRIBUTION_PARTNER_AGREEMENT        | 7,231     |
| EMPLOYEE_AGREEMENT                    | 1,433     |
| ENTERPRISE_AGREEMENT                  | 1,616     |
| ENTERPRISE_LICENSE_AGREEMENT          | 8,574     |
| EXECUTIVE_SEVERANCE_AGREEMENT         | 5,177     |
| FINANCIAL_REPORT_AGREEMENT            | 4,264     |
| HARMFUL_ADVICE                        | 474       |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT    | 4,116     |
| LOAN_AND_SECURITY_AGREEMENT           | 6,354     |
| MEDICAL_ADVICE                        | 289       |
| MERGER_AGREEMENT                      | 7,079     |
| NDA_AGREEMENT                         | 1,452     |
| NORMAL_TEXT                           | 1,808     |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 6,177     |
| PRICE_LIST_AGREEMENT                  | 5,453     |
| SETTLEMENT_AGREEMENT                  | 5,806     |
| SEXUAL_HARRASSMENT                    | 4,750     |

#### Metrics

| Agreement Type                        | precision | recall | f1-score | support |
| ------------------------------------- | --------- | ------ | -------- | ------- |
| BOARD_MEETING_AGREEMENT               | 0.93      | 0.95   | 0.94     | 4335    |
| CONSULTING_AGREEMENT                  | 0.72      | 0.98   | 0.84     | 1593    |
| CUSTOMER_LIST_AGREEMENT               | 0.64      | 0.82   | 0.72     | 4335    |
| DISTRIBUTION_PARTNER_AGREEMENT        | 0.83      | 0.47   | 0.61     | 7231    |
| EMPLOYEE_AGREEMENT                    | 0.78      | 0.92   | 0.85     | 1333    |
| ENTERPRISE_AGREEMENT                  | 0.29      | 0.40   | 0.34     | 1616    |
| ENTERPRISE_LICENSE_AGREEMENT          | 0.88      | 0.79   | 0.83     | 5574    |
| EXECUTIVE_SEVERANCE_AGREEMENT         | 0.92      | 0.85   | 0.89     | 8177    |
| FINANCIAL_REPORT_AGREEMENT            | 0.89      | 0.98   | 0.93     | 4264    |
| HARMFUL_ADVICE                        | 0.79      | 0.95   | 0.86     | 474     |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT    | 0.91      | 0.98   | 0.94     | 4116    |
| LOAN_AND_SECURITY_AGREEMENT           | 0.77      | 0.98   | 0.86     | 6354    |
| MEDICAL_ADVICE                        | 0.81      | 0.99   | 0.89     | 289     |
| MERGER_AGREEMENT                      | 0.89      | 0.77   | 0.83     | 7279    |
| NDA_AGREEMENT                         | 0.70      | 0.57   | 0.62     | 1452    |
| NORMAL_TEXT                           | 0.79      | 0.97   | 0.87     | 1888    |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 0.95      | 0.99   | 0.97     | 6177    |
| PRICE_LIST_AGREEMENT                  | 0.60      | 0.75   | 0.67     | 5565    |
| SETTLEMENT_AGREEMENT                  | 0.82      | 0.54   | 0.65     | 5843    |
| SEXUAL_HARRASSMENT                    | 0.97      | 0.94   | 0.95     | 440     |
|                                       |           |        |          |         |
| accuracy                              |           |        | 0.79     | 82916   |
| macro avg                             | 0.79      | 0.83   | 0.80     | 82916   |
| weighted avg                          | 0.83      | 0.81   | 0.81     | 82916   |

#### Results

Per-label precision, recall, and F1-score are reported above for all 20 labels. Overall accuracy on the test set is 0.79, with macro-average precision/recall/F1 of roughly 0.79/0.83/0.80 and weighted averages of 0.83/0.81/0.81.
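
The card does not include the evaluation script; if you want to produce a comparable per-label report on your own labelled test set, scikit-learn's `classification_report` is one way to do it. A minimal sketch, with placeholder texts and gold labels:

```python
# Hedged sketch: build a per-label precision/recall/F1 report with scikit-learn.
# `texts` and `gold_label_ids` are placeholders for your own labelled test set.
import torch
from sklearn.metrics import classification_report
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("daxa-ai/pebblo-classifier")
model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/pebblo-classifier")
model.eval()

texts = ["...first test document...", "...second test document..."]
gold_label_ids = [3, 17]  # encoded gold classes for the documents above

predicted_ids = []
with torch.no_grad():
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        predicted_ids.append(int(model(**inputs).logits.argmax(dim=-1)))

print(classification_report(gold_label_ids, predicted_ids, zero_division=0))
```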