---
license: apache-2.0
tags:
- question-answering
- complexity-classification
- distilbert
datasets:
- wesley7137/question_complexity_classification
---
# question-complexity-classifier
🤗 Fine-tuned DistilBERT model for classifying question complexity (Simple vs. Complex)
## Model Details
### Model Description
- **Architecture:** DistilBERT base uncased
- **Fine-tuned on:** Question Complexity Classification Dataset
- **Language:** English
- **License:** Apache 2.0
- **Max Sequence Length:** 128 tokens
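For direct control over tokenization and output probabilities, the model can also be loaded with the `transformers` Auto classes. A minimal sketch, assuming the repo id from the Uses example below and the documented 128-token limit:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repo id taken from the Uses example below.
MODEL_ID = "grahamaco/question-complexity-classifier"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)

# Truncate to the 128-token limit the model was trained with.
inputs = tokenizer(
    "What is the capital of France?",
    truncation=True,
    max_length=128,
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits

probs = logits.softmax(dim=-1).squeeze()
label = model.config.id2label[int(probs.argmax())]
print(label, float(probs.max()))
```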
## Uses
Quick start with the `transformers` text-classification pipeline:
```python
from transformers import pipeline
classifier = pipeline(
    "text-classification",
    model="grahamaco/question-complexity-classifier",
    tokenizer="grahamaco/question-complexity-classifier",
    truncation=True,
    max_length=128,  # matches the 128-token training config
)

result = classifier("Explain quantum computing in simple terms")
# Example output: [{'label': 'COMPLEX', 'score': 0.97}]
```
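The pipeline also accepts a list of questions, which is convenient for batch scoring. A short usage sketch reusing the `classifier` object from above (the example questions are placeholders):

```python
questions = [
    "What is 2 + 2?",
    "Derive the average-case time complexity of quicksort.",
]
# Passing a list runs batched inference and returns one dict per question.
for question, prediction in zip(questions, classifier(questions)):
    print(f"{prediction['label']:>8} ({prediction['score']:.2f})  {question}")
```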
## Training Details
- **Epochs:** 5
- **Batch Size:** 32 (global)
- **Learning Rate:** 2e-5
- **Train/Val/Test Split:** 80/10/10 (stratified)
- **Early Stopping:** Patience of 2 epochs
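The training script is not included in this repository, but the hyperparameters above map directly onto the `transformers` Trainer API. A hypothetical reconstruction, assuming a pre-tokenized dataset (padded/truncated to 128 tokens) with the 80/10/10 split already applied:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

# Hypothetical reconstruction from the hyperparameters listed above;
# the actual training configuration for this model is not published.
training_args = TrainingArguments(
    output_dir="question-complexity-classifier",
    num_train_epochs=5,
    per_device_train_batch_size=32,   # single device, so the global batch size is 32
    learning_rate=2e-5,
    eval_strategy="epoch",            # evaluate each epoch so early stopping can trigger
    save_strategy="epoch",
    load_best_model_at_end=True,      # required by EarlyStoppingCallback
    metric_for_best_model="eval_loss",
    greater_is_better=False,          # lower validation loss is better
)

trainer = Trainer(
    model=model,                          # model from the loading sketch above
    args=training_args,
    train_dataset=tokenized["train"],     # assumes a pre-tokenized 80/10/10 split
    eval_dataset=tokenized["validation"],
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],  # patience of 2 epochs
)
trainer.train()
```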
## Evaluation Results
| Metric | Value |
|--------|-------|
| Accuracy | 0.92 |
| F1 Score | 0.91 |
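These numbers can be sanity-checked on the held-out test split with scikit-learn. A sketch, assuming `test_questions` and `test_labels` hold the 10% test portion and that the label names match the pipeline output above:

```python
from sklearn.metrics import accuracy_score, f1_score

# `classifier` is the pipeline from the Uses section; `test_questions` and
# `test_labels` are hypothetical names for the held-out 10% test split.
predictions = [p["label"] for p in classifier(test_questions)]
print("Accuracy:", accuracy_score(test_labels, predictions))
print("F1 Score:", f1_score(test_labels, predictions, pos_label="COMPLEX"))
```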
## Performance
| Metric | Value |
|--------|-------|
| Inference Latency | 15.2 ms (CPU) |
| Throughput | 68.4 samples/sec (GPU) |
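Latency and throughput depend heavily on hardware, so treat the figures above as indicative. A rough single-input CPU latency check, reusing the pipeline from the Uses section:

```python
import time

question = "Explain quantum computing in simple terms"
classifier(question)  # warm-up run to exclude one-time setup cost

runs = 100
start = time.perf_counter()
for _ in range(runs):
    classifier(question)
elapsed_ms = (time.perf_counter() - start) / runs * 1000
print(f"Mean latency: {elapsed_ms:.1f} ms/sample")
```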
## Ethical Considerations
This model is intended for educational content classification only. Developers should:
- Regularly audit performance across different question types
- Monitor for unintended bias in complexity assessments
- Provide human-review mechanisms for high-stakes classifications
- Validate classifications against original context when used with RAG systems