T18 Phase 1 Tier 1: model card
README.md
CHANGED
|
@@ -130,11 +130,11 @@ LogReg wins on simplicity-tiebreak.
|
|
| 130 |
|
| 131 |
### Primary use case
|
| 132 |
|
| 133 |
-
Upstream gate in the CCR report pipeline. After the XGBoost page classifier flags pages as CCR, this model evaluates whether the parent document is actually a Declaration of Covenants worth running CCR extraction on. Decision band:
|
| 134 |
|
| 135 |
-
- **Score < 0.
|
| 136 |
-
- **Score >= 0.
|
| 137 |
-
- **0.
|
| 138 |
|
| 139 |
### Out-of-scope use
|
| 140 |
|
|
@@ -146,9 +146,9 @@ Upstream gate in the CCR report pipeline. After the XGBoost page classifier flag
|
|
| 146 |
|
| 147 |
### Calibration
|
| 148 |
|
| 149 |
-
ECE 0.
|
| 150 |
|
| 151 |
-
|
| 152 |
|
| 153 |
### Sample size
|
| 154 |
|
|
@@ -191,19 +191,42 @@ doc_vector = np.mean(page_vectors, axis=0).reshape(1, -1)
|
|
| 191 |
# Predict
|
| 192 |
score = model.predict_proba(doc_vector)[0, 1]
|
| 193 |
|
| 194 |
-
# Three-band decision
|
| 195 |
-
if score < 0.
|
| 196 |
decision = "REJECT" # confident not a Declaration; skip CCR pipeline
|
| 197 |
-
elif score >= 0.
|
| 198 |
decision = "FAST_PASS" # confident Declaration; bypass agentic validator
|
| 199 |
else:
|
| 200 |
decision = "ESCALATE" # ambiguous; run agentic detect_ccr
|
| 201 |
```
|
| 202 |
|
| 203 |
### Files in this repo
|
| 204 |
|
| 205 |
-
- `ccr_binary_logreg_tuned.joblib` — pickled dict containing `model` (sklearn LogisticRegression)
|
| 206 |
-
- `
|
|
|
|
| 207 |
|
| 208 |
## Training Procedure
|
| 209 |
|
|
@@ -254,4 +277,6 @@ Model artifacts are versioned via HuggingFace commit history. `config.json` incl
|
|
| 254 |
|
| 255 |
## Maintenance
|
| 256 |
|
| 257 |
-
This model is part of the T18 plan (CCR Upstream Input Hardening) in the GoverningDocs platform. See `plans/T18_CCR_UPSTREAM_INPUT_HARDENING_PLAN.md` (v2.
|
|
| 130 |
|
| 131 |
### Primary use case
|
| 132 |
|
| 133 |
+
Upstream gate in the CCR report pipeline. After the XGBoost page classifier flags pages as CCR, this model evaluates whether the parent document is actually a Declaration of Covenants worth running CCR extraction on. Decision bands (recalibrated empirically: the original plan-time bands `(0.30, 0.85)` left FAST_PASS empty in production because real Declarations have raw scores of 0.45-0.70):
|
| 134 |
|
| 135 |
+
- **Score < 0.25**: confident NOT-CCR. Skip the CCR pipeline entirely; the document is removed from CCR dispatch.
|
| 136 |
+
- **Score >= 0.55**: confident IS-CCR. Trust the classifier; the fast path bypasses the more expensive agentic `detect_ccr` validator.
|
| 137 |
+
- **0.25 <= Score < 0.55**: ambiguous. Escalate to agentic `detect_ccr` for a deeper look.
|
| 138 |
|
| 139 |
### Out-of-scope use
|
| 140 |
|
|
|
|
| 146 |
|
| 147 |
### Calibration
|
| 148 |
|
| 149 |
+
The raw LogReg artifact has an ECE (expected calibration error) of 0.19-0.28 on validation/test, so its predicted probabilities are systematically miscalibrated. The decision-band thresholds `(0.25, 0.55)` above are **empirically tuned on the production score distribution, not probability-calibrated**.
|
| 150 |
|
| 151 |
+
A separate isotonic calibrator artifact (`ccr_binary_isotonic_calibrator.joblib`) ships in the same repo and reduces test-set ECE from 0.278 to 0.087 (a 3.2x improvement). It is **purely additive metadata**: the production gate still consumes raw scores. Use the calibrator if you need probability-calibrated outputs for drift monitoring, signal combination with other classifiers, or user-facing confidence display. See the "Calibration Support" section below for details.
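
For readers who want to reproduce an ECE figure on their own labelled scores, the following is a minimal sketch of a binned ECE for the positive class. It is an illustration only: the function name, the equal-width binning, and the default of 10 bins are assumptions, and the exact ECE definition used to produce the numbers above is not stated in this card.

```python
import numpy as np


def expected_calibration_error(scores, labels, n_bins=10):
    """Binned ECE for a binary classifier (hypothetical helper, not part of this repo).

    scores: predicted probability of the positive class, in [0, 1]
    labels: ground-truth labels, 0 or 1
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)

    # Equal-width bin edges on [0, 1]; digitizing against the inner edges
    # yields bin ids in 0..n_bins-1, so a score of exactly 1.0 falls in the last bin.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(scores, edges[1:-1])

    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if not mask.any():
            continue
        # Gap between mean predicted probability and observed positive rate,
        # weighted by the fraction of samples that fall in this bin.
        gap = abs(scores[mask].mean() - labels[mask].mean())
        ece += mask.mean() * gap
    return float(ece)
```

With a split as small as the one described in this card, the result is sensitive to the bin count, so it is worth comparing values only when they were computed with the same binning.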
|
| 152 |
|
| 153 |
### Sample size
|
| 154 |
|
|
|
|
| 191 |
# Predict
|
| 192 |
score = model.predict_proba(doc_vector)[0, 1]
|
| 193 |
|
| 194 |
+
# Three-band decision (recalibrated production bands)
|
| 195 |
+
if score < 0.25:
|
| 196 |
decision = "REJECT" # confident not a Declaration; skip CCR pipeline
|
| 197 |
+
elif score >= 0.55:
|
| 198 |
decision = "FAST_PASS" # confident Declaration; bypass agentic validator
|
| 199 |
else:
|
| 200 |
decision = "ESCALATE" # ambiguous; run agentic detect_ccr
|
| 201 |
```
|
| 202 |
|
| 203 |
+
### Calibration Support
|
| 204 |
+
|
| 205 |
+
An optional isotonic calibrator (`ccr_binary_isotonic_calibrator.joblib`) maps raw scores to probability-calibrated outputs.
|
| 206 |
+
|
| 207 |
+
```python
|
| 208 |
+
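# Imports repeated here so this snippet does not depend on the earlier block's imports
import joblib
from huggingface_hub import hf_hub_download

calibrator_path = hf_hub_download(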
|
| 209 |
+
repo_id="GoverningDocs/ccr-binary-logreg",
|
| 210 |
+
filename="ccr_binary_isotonic_calibrator.joblib",
|
| 211 |
+
)
|
| 212 |
+
cal_artifact = joblib.load(calibrator_path)
|
| 213 |
+
calibrator = cal_artifact["calibrator"]
|
| 214 |
+
|
| 215 |
+
# Apply isotonic to a raw score (cv="prefit" + method="isotonic" + binary
|
| 216 |
+
# fits on raw predict_proba outputs, so we can apply directly to a float)
|
| 217 |
+
inner = calibrator.calibrated_classifiers_[0].calibrators[0]
|
| 218 |
+
calibrated = float(inner.predict([score])[0])
|
| 219 |
+
```
|
| 220 |
+
|
| 221 |
+
**Caveats:**
|
| 222 |
+
- The shipped isotonic calibrator was fit on a small (~70-doc) validation split and produces approximately three plateau outputs (0.737, 0.833, 1.000). Treat calibrated scores as a three-level (low / med / high) confidence signal rather than fine-grained probabilities.
|
| 223 |
+
- The calibrator's `shipped_model_filename` field MUST match the model file you loaded. Cross-check before use to guard against artifact mismatch.
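
The cross-check in the second caveat could look roughly like the sketch below. The helper name `check_calibrator_pairing` is hypothetical, and the sketch assumes that `shipped_model_filename` stores a bare filename (no directory part), which this card does not confirm.

```python
from pathlib import Path


def check_calibrator_pairing(cal_artifact, model_path):
    """Raise if the calibrator was not built for the model file that was loaded.

    cal_artifact: dict loaded from the calibrator .joblib
    model_path:   local path of the model file (e.g. as returned by hf_hub_download)
    """
    # Assumption: the field holds a bare filename such as "ccr_binary_logreg_tuned.joblib".
    expected = cal_artifact["shipped_model_filename"]
    actual = Path(model_path).name
    if expected != actual:
        raise ValueError(
            f"Calibrator is paired with {expected!r}, but the loaded model is {actual!r}"
        )
```

Calling it right after both artifacts are loaded, and before any calibrated score is computed, keeps a mismatched pair from silently producing misleading probabilities.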
|
| 224 |
+
|
| 225 |
### Files in this repo
|
| 226 |
|
| 227 |
+
- `ccr_binary_logreg_tuned.joblib` — pickled dict containing `model` (sklearn LogisticRegression) and `config` (dict with `embedding_model`, `max_pages_per_doc`, `skip_boilerplate` flags). The `threshold` field (0.436) is a Phase 1 artifact; production uses bands, not a single threshold.
|
| 228 |
+
- `ccr_binary_isotonic_calibrator.joblib` — pickled dict containing `calibrator` (sklearn `CalibratedClassifierCV` with `cv="prefit"`, `method="isotonic"`), `shipped_model_filename` (paired model artifact), and ECE before/after metadata.
|
| 229 |
+
- `config.json` — JSON-readable summary of the model configuration, decision bands, and calibrator metadata.
|
| 230 |
|
| 231 |
## Training Procedure
|
| 232 |
|
|
|
|
| 277 |
|
| 278 |
## Maintenance
|
| 279 |
|
| 280 |
+
This model is part of the T18 plan (CCR Upstream Input Hardening) in the GoverningDocs platform. See `plans/T18_CCR_UPSTREAM_INPUT_HARDENING_PLAN.md` (v2.2.1, Completed) in the product repo for design rationale, alternatives considered (page-classifier retrain, agentic-only, signature patterns), and Phase 2 wire-in.
|
| 281 |
+
|
| 282 |
+
The calibrator artifact was added per `plans/CCR_BINARY_ISOTONIC_RECALIBRATION_PLAN.md` (v1.4.0). Phase 1 findings: `experiments/setfit_ccr_binary/ISOTONIC_CALIBRATION_FINDINGS.md`.
|