Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
|
@@ -37,9 +37,9 @@ model-index:
|
|
| 37 |
- type: pii-detection-rate
|
| 38 |
value: 1.000
|
| 39 |
name: PII Detection Rate
|
| 40 |
-
- type: completeness
|
| 41 |
value: 0.650
|
| 42 |
-
name: Completeness
|
| 43 |
- type: semantic-preservation
|
| 44 |
value: 0.811
|
| 45 |
name: Semantic Preservation
|
|
@@ -71,9 +71,10 @@ model-index:
|
|
| 71 |
|
| 72 |
### Key Features
|
| 73 |
- π **Privacy-First**: Removes personal identifiers automatically
|
| 74 |
-
- π― **
|
|
|
|
| 75 |
- π **Compact Size**: 136MB (Q8_0 quantized)
|
| 76 |
-
- β‘ **Fast Inference**:
|
| 77 |
- π **Multi-Domain**: Works across medical, legal, HR, and general text
|
| 78 |
- π **Local Processing**: No data sent to external servers
|
| 79 |
|
|
@@ -191,17 +192,30 @@ print(result)
|
|
| 191 |
|
| 192 |
| Metric | Score | Description |
|
| 193 |
|--------|-------|-------------|
|
| 194 |
-
| **PII Detection Rate** | **100%** | **
|
| 195 |
-
| **Completeness
|
| 196 |
| **Semantic Preservation** | **81.1%** | **How well original meaning is preserved** |
|
| 197 |
| **Average Latency** | **477ms** | **Response time performance** |
|
| 198 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 199 |
### Performance Insights
|
| 200 |
|
| 201 |
-
- β
**Perfect PII Detection**: 100%
|
| 202 |
-
- β
**Strong
|
| 203 |
-
- β
**
|
| 204 |
-
- β
**
|
|
|
|
| 205 |
|
| 206 |
## ποΈ Technical Details
|
| 207 |
|
|
@@ -431,7 +445,7 @@ If you use DeId-Small in your research, please cite:
|
|
| 431 |
|
| 432 |
- **Website**: [minibase.ai](https://minibase.ai)
|
| 433 |
- **Discord**: [Join our community](https://discord.com/invite/BrJn4D2Guh)
|
| 434 |
-
- **Documentation**: [docs.minibase.ai](https://
|
| 435 |
|
| 436 |
## π License
|
| 437 |
|
|
|
|
| 37 |
- type: pii-detection-rate
|
| 38 |
value: 1.000
|
| 39 |
name: PII Detection Rate
|
| 40 |
+
- type: pii-removal-completeness
|
| 41 |
value: 0.650
|
| 42 |
+
name: PII Removal Completeness
|
| 43 |
- type: semantic-preservation
|
| 44 |
value: 0.811
|
| 45 |
name: Semantic Preservation
|
|
|
|
| 71 |
|
| 72 |
### Key Features
|
| 73 |
- π **Privacy-First**: Removes personal identifiers automatically
|
| 74 |
+
- π― **Perfect PII Detection**: 100% detection rate when PII is present
|
| 75 |
+
- β
**Strong PII Removal**: 65% of texts completely de-identified
|
| 76 |
- π **Compact Size**: 136MB (Q8_0 quantized)
|
| 77 |
+
- β‘ **Fast Inference**: 477ms average response time
|
| 78 |
- π **Multi-Domain**: Works across medical, legal, HR, and general text
|
| 79 |
- π **Local Processing**: No data sent to external servers
|
| 80 |
|
|
|
|
| 192 |
|
| 193 |
| Metric | Score | Description |
|
| 194 |
|--------|-------|-------------|
|
| 195 |
+
| **PII Detection Rate** | **100%** | **Model responds to PII presence with placeholders** |
|
| 196 |
+
| **PII Removal Completeness** | **65%** | **Successfully removes all detectable PII from output** |
|
| 197 |
| **Semantic Preservation** | **81.1%** | **How well original meaning is preserved** |
|
| 198 |
| **Average Latency** | **477ms** | **Response time performance** |
|
| 199 |
|
| 200 |
+
### Understanding the Metrics
|
| 201 |
+
|
| 202 |
+
**PII Detection Rate (100%)**: Measures whether the model recognizes when personal information is present in the input text and responds by generating placeholders. This is a measure of the model's sensitivity to PII presence.
|
| 203 |
+
|
| 204 |
+
**PII Removal Completeness (65%)**: Measures whether the model successfully removes ALL detectable personal identifiers from the output text. This is a strict measure - even one remaining PII element (like a name, date, or phone number) counts as incomplete.
|
| 205 |
+
|
| 206 |
+
**Why 65% is Strong Performance**: Achieving 100% completeness is extremely challenging because:
|
| 207 |
+
- PII can be contextually important (e.g., "Dr. Smith" in medical records)
|
| 208 |
+
- Some PII might be embedded in complex ways
|
| 209 |
+
- Perfect removal could harm text coherence or meaning
|
| 210 |
+
- 65% completeness means the model reliably sanitizes most texts while preserving utility
|
| 211 |
+
|
| 212 |
### Performance Insights
|
| 213 |
|
| 214 |
+
- β
**Perfect PII Detection**: 100% of texts with PII trigger placeholder generation
|
| 215 |
+
- β
**Strong PII Removal**: 65% of outputs are completely free of detectable PII
|
| 216 |
+
- β
**Excellent Semantic Preservation**: 81.1% meaning retention during de-identification
|
| 217 |
+
- β
**Fast Inference**: 477ms average response time
|
| 218 |
+
- β
**Unified Performance**: Consistent across medical, legal, HR, and general text
|
| 219 |
|
| 220 |
## ποΈ Technical Details
|
| 221 |
|
|
|
|
| 445 |
|
| 446 |
- **Website**: [minibase.ai](https://minibase.ai)
|
| 447 |
- **Discord**: [Join our community](https://discord.com/invite/BrJn4D2Guh)
|
| 448 |
+
- **Documentation**: [docs.minibase.ai](https://help.minibase.ai)
|
| 449 |
|
| 450 |
## π License
|
| 451 |
|