---
license: mit
language:
- en
---
# Model Card for Pebblo Classifier
This model card outlines the Pebblo Classifier, a machine learning system specialized in text classification. Developed by DAXA.AI, this model is adept at categorizing various agreement documents within organizational structures, trained on 20 distinct labels.
## Model Details
### Model Description
The Pebblo Classifier is a BERT-based model fine-tuned from distilbert-base-uncased and targets RAG (Retrieval-Augmented Generation) applications. It classifies text into categories such as "BOARD_MEETING_AGREEMENT," "CONSULTING_AGREEMENT," and others, streamlining document classification workflows.
- **Developed by:** DAXA.AI
- **Funded by:** Open Source
- **Model type:** Classification model
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** distilbert-base-uncased
### Model Sources
- **Repository:** [https://huggingface.co/daxa-ai/pebblo-classifier](https://huggingface.co/daxa-ai/pebblo-classifier)
- **Demo:** [https://huggingface.co/spaces/daxa-ai/Daxa-Classifier](https://huggingface.co/spaces/daxa-ai/Daxa-Classifier)
## Uses
### Intended Use
The model is designed for direct application in document classification, capable of immediate deployment without additional fine-tuning.
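For a quick sanity check, the model can also be loaded through the Transformers `pipeline` API. This is a minimal sketch, not the official usage path: if the repository's config does not map class ids to the agreement-type names, the pipeline will return generic ids such as `LABEL_7`, which can be decoded with the label encoder shown in the "How to Get Started" section below.
```python
from transformers import pipeline

# Minimal sketch: load the classifier straight from the Hub
classifier = pipeline("text-classification", model="daxa-ai/pebblo-classifier")

result = classifier("Please enter your text here.")
print(result)  # e.g. [{'label': ..., 'score': ...}]; the label may be a raw class id
```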
### Recommendations
End users should be aware of the model's potential biases and limitations and take them into account when relying on its predictions.
## How to Get Started with the Model
Use the code below to get started with the model.
```python
# Import necessary libraries
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import joblib
from huggingface_hub import hf_hub_download
# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("daxa-ai/pebblo-classifier")
model = AutoModelForSequenceClassification.from_pretrained("daxa-ai/pebblo-classifier")
# Example text
text = "Please enter your text here."
encoded_input = tokenizer(text, return_tensors='pt')

# Run the forward pass without tracking gradients
with torch.no_grad():
    output = model(**encoded_input)
# Apply softmax to the logits
probabilities = torch.nn.functional.softmax(output.logits, dim=-1)
# Get the predicted label
predicted_label = torch.argmax(probabilities, dim=-1)
# Hugging Face repository that hosts the label encoder
REPO_NAME = "daxa-ai/pebblo-classifier"

# Path to the label encoder file in the repository
LABEL_ENCODER_FILE = "label encoder.joblib"

# Download (and cache) the label encoder file from the Hub
filename = hf_hub_download(repo_id=REPO_NAME, filename=LABEL_ENCODER_FILE)
# Load the label encoder
label_encoder = joblib.load(filename)
# Decode the predicted label
decoded_label = label_encoder.inverse_transform(predicted_label.numpy())
print(decoded_label)
```
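The snippet above scores a single string. Document pipelines typically classify many chunks at once; the sketch below (reusing the `tokenizer`, `model`, and `label_encoder` objects loaded above, with a hypothetical list of texts) runs a whole batch in one forward pass.
```python
# Hypothetical batch of document chunks
texts = [
    "This consulting agreement is entered into by and between ...",
    "The board convened its quarterly meeting to approve ...",
]

# Tokenize the batch with padding/truncation so all tensors share one shape
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**batch).logits

# One predicted class index per input text
predicted = torch.argmax(logits, dim=-1)

# Decode the indices back to agreement-type names
print(label_encoder.inverse_transform(predicted.numpy()))
```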
## Training Details
### Training Data
The training dataset consists of 131,771 entries covering 20 unique labels. The labels span various document types, and instances are distributed across three text-length buckets (128 ± x, 256 ± x, and 512 ± x words, where x varies within 20).
Here are the labels along with their respective counts in the dataset:
| Agreement Type | Instances |
| --------------------------------------- | --------- |
| BOARD_MEETING_AGREEMENT | 4,225 |
| CONSULTING_AGREEMENT | 2,965 |
| CUSTOMER_LIST_AGREEMENT | 9,000 |
| DISTRIBUTION_PARTNER_AGREEMENT | 8,339 |
| EMPLOYEE_AGREEMENT | 3,921 |
| ENTERPRISE_AGREEMENT | 3,820 |
| ENTERPRISE_LICENSE_AGREEMENT | 9,000 |
| EXECUTIVE_SEVERANCE_AGREEMENT | 9,000 |
| FINANCIAL_REPORT_AGREEMENT | 8,381 |
| HARMFUL_ADVICE | 2,025 |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT | 7,037 |
| LOAN_AND_SECURITY_AGREEMENT | 9,000 |
| MEDICAL_ADVICE | 2,359 |
| MERGER_AGREEMENT | 7,706 |
| NDA_AGREEMENT | 2,966 |
| NORMAL_TEXT | 6,742 |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 9,000 |
| PRICE_LIST_AGREEMENT | 9,000 |
| SETTLEMENT_AGREEMENT | 9,000 |
| SEXUAL_HARRASSMENT | 8,321 |
## Evaluation
### Testing Data & Metrics
#### Testing Data
Evaluation was performed on a dataset of 82,917 entries, generated with a sampling temperature between 1 and 1.25 to introduce randomness.
Here are the labels along with their respective counts in the dataset:
| Agreement Type | Instances |
| --------------------------------------- | --------- |
| BOARD_MEETING_AGREEMENT | 4,335 |
| CONSULTING_AGREEMENT | 1,533 |
| CUSTOMER_LIST_AGREEMENT | 4,995 |
| DISTRIBUTION_PARTNER_AGREEMENT | 7,231 |
| EMPLOYEE_AGREEMENT | 1,433 |
| ENTERPRISE_AGREEMENT | 1,616 |
| ENTERPRISE_LICENSE_AGREEMENT | 8,574 |
| EXECUTIVE_SEVERANCE_AGREEMENT | 5,177 |
| FINANCIAL_REPORT_AGREEMENT | 4,264 |
| HARMFUL_ADVICE | 474 |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT | 4,116 |
| LOAN_AND_SECURITY_AGREEMENT | 6,354 |
| MEDICAL_ADVICE | 289 |
| MERGER_AGREEMENT | 7,079 |
| NDA_AGREEMENT | 1,452 |
| NORMAL_TEXT | 1,808 |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 6,177 |
| PRICE_LIST_AGREEMENT | 5,453 |
| SETTLEMENT_AGREEMENT | 5,806 |
| SEXUAL_HARRASSMENT | 4,750 |
#### Metrics
| Agreement Type | precision | recall | f1-score | support |
| ------------------------------------------- | --------- | ------ | -------- | ------- |
| BOARD_MEETING_AGREEMENT | 0.93 | 0.95 | 0.94 | 4335 |
| CONSULTING_AGREEMENT | 0.72 | 0.98 | 0.84 | 1593 |
| CUSTOMER_LIST_AGREEMENT | 0.64 | 0.82 | 0.72 | 4335 |
| DISTRIBUTION_PARTNER_AGREEMENT | 0.83 | 0.47 | 0.61 | 7231 |
| EMPLOYEE_AGREEMENT | 0.78 | 0.92 | 0.85 | 1333 |
| ENTERPRISE_AGREEMENT | 0.29 | 0.40 | 0.34 | 1616 |
| ENTERPRISE_LICENSE_AGREEMENT | 0.88 | 0.79 | 0.83 | 5574 |
| EXECUTIVE_SEVERANCE_AGREEMENT                | 0.92      | 0.85   | 0.89     | 8177    |
| FINANCIAL_REPORT_AGREEMENT | 0.89 | 0.98 | 0.93 | 4264 |
| HARMFUL_ADVICE | 0.79 | 0.95 | 0.86 | 474 |
| INTERNAL_PRODUCT_ROADMAP_AGREEMENT | 0.91 | 0.98 | 0.94 | 4116 |
| LOAN_AND_SECURITY_AGREEMENT | 0.77 | 0.98 | 0.86 | 6354 |
| MEDICAL_ADVICE | 0.81 | 0.99 | 0.89 | 289 |
| MERGER_AGREEMENT | 0.89 | 0.77 | 0.83 | 7279 |
| NDA_AGREEMENT | 0.70 | 0.57 | 0.62 | 1452 |
| NORMAL_TEXT | 0.79 | 0.97 | 0.87 | 1888 |
| PATENT_APPLICATION_FILLINGS_AGREEMENT | 0.95 | 0.99 | 0.97 | 6177 |
| PRICE_LIST_AGREEMENT | 0.60 | 0.75 | 0.67 | 5565 |
| SETTLEMENT_AGREEMENT | 0.82 | 0.54 | 0.65 | 5843 |
| SEXUAL_HARASSMENT | 0.97 | 0.94 | 0.95 | 440 |
| | | | | |
| accuracy | | | 0.79 | 82916 |
| macro avg | 0.79 | 0.83 | 0.80 | 82916 |
| weighted avg | 0.83 | 0.81 | 0.81 | 82916 |
#### Results
Precision, recall, and F1-score are reported for all 20 labels above. Overall accuracy on the test set is 0.79; the macro average (which weights every label equally) reaches an F1 of 0.80, while the weighted average (which weights each label by its support) reaches an F1 of 0.81.
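For reference, a per-label report in this format can be reproduced with scikit-learn's `classification_report`, which also produces the accuracy, macro-average, and weighted-average rows. A minimal sketch, assuming hypothetical `y_true` and `y_pred` lists of true and predicted label names for the test set:
```python
from sklearn.metrics import classification_report

# Hypothetical true and predicted labels for a handful of test entries
y_true = ["CONSULTING_AGREEMENT", "NDA_AGREEMENT", "NORMAL_TEXT"]
y_pred = ["CONSULTING_AGREEMENT", "NORMAL_TEXT", "NORMAL_TEXT"]

# Per-label precision/recall/F1 plus accuracy, macro avg, and weighted avg,
# matching the layout of the table above
print(classification_report(y_true, y_pred, zero_division=0))
```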