---
license: apache-2.0
tags:
- question-answering
- complexity-classification
- distilbert
datasets:
- wesley7137/question_complexity_classification
---
# question-complexity-classifier
馃 Fine-tuned DistilBERT model for classifying question complexity (Simple vs Complex)
## Model Details
### Model Description
- **Architecture:** DistilBERT base uncased
- **Fine-tuned on:** Question Complexity Classification Dataset
- **Language:** English
- **License:** Apache 2.0
- **Max Sequence Length:** 128 tokens
## Uses
```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    tokenizer="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,  # matches the training configuration
)

result = classifier("Explain quantum computing in simple terms")
# Example output (the pipeline returns a list of dicts):
# [{'label': 'COMPLEX', 'score': 0.97}]
```
## Training Details
- **Epochs:** 5
- **Batch Size:** 32 (global)
- **Learning Rate:** 2e-5
- **Train/Val/Test Split:** 80/10/10 (stratified)
- **Early Stopping:** Patience of 2 epochs
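The early-stopping rule above can be sketched in plain Python. This is an illustrative reimplementation, not the actual training code; the loss values below are hypothetical.

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the 1-indexed epoch at which training stops: the first
    epoch after the best validation loss has failed to improve for
    `patience` consecutive epochs (or the last epoch otherwise)."""
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return epoch
    return len(val_losses)

# Hypothetical validation losses over the 5 training epochs:
losses = [0.52, 0.41, 0.39, 0.40, 0.42]
print(early_stop_epoch(losses, patience=2))  # -> 5
```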
## Evaluation Results
| Metric | Value |
|--------|-------|
| Accuracy | 0.92 |
| F1 Score | 0.91 |
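For reference, accuracy and binary F1 can be computed from confusion-matrix counts as below. The counts here are hypothetical, chosen only to reproduce figures close to those reported; they are not the model's actual test-set confusion matrix.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Compute accuracy and the F1 score of the positive
    (COMPLEX) class from binary confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1

# Hypothetical counts on a 100-example test set:
acc, f1 = accuracy_and_f1(tp=43, fp=3, fn=5, tn=49)
print(round(acc, 2), round(f1, 2))  # -> 0.92 0.91
```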
## Performance
| Metric | Value |
|--------|-------|
| Inference Latency | 15.2ms (CPU) |
| Throughput | 68.4 samples/sec (GPU) |
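The figures above can be reproduced with a simple timing harness like the sketch below. The `warmup`/`iters` values are illustrative, and the lambda is a stand-in workload; substitute the actual `classifier(...)` call to benchmark this model on your hardware.

```python
import time

def benchmark(fn, warmup=3, iters=20):
    """Return (mean latency in ms, throughput in calls/sec)
    for a zero-argument callable."""
    for _ in range(warmup):  # warm caches / lazy initialization
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    latency_ms = elapsed / iters * 1000
    return latency_ms, iters / elapsed

# Stand-in workload -- replace with: lambda: classifier("some question")
latency, throughput = benchmark(lambda: sum(range(10_000)))
```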
## Ethical Considerations
This model is intended for educational content classification only. Developers should:
- Regularly audit performance across different question types
- Monitor for unintended bias in complexity assessments
- Provide human-review mechanisms for high-stakes classifications
- Validate classifications against original context when used with RAG systems