episodic-ingestion-compiler — ModernBERT-large + span + LDAM (Phase 4)

Fine-tuned from answerdotai/ModernBERT-large for the ingestion-model-spec-v1 task in episodic-rs. The model produces the information-bearing subset of a claim candidate; a deterministic wrapper (available in Python at scripts/ingestion_wrapper.py or natively in Rust via episodic-ingestion-compiler-rust-bundle) assembles the full JSON.

Training pipeline: Phase 2 → 3 → 4

Phase 2A — ModernBERT-large + 5 heads (claim_present, predicate, subject_type, confidence, span). Trained 5000 steps on 94k labeled conversational blurbs.
Phase 3 — expanded training set with autonomous-agent traces (tool_call → tool_result joins), tightened deterministic labeler. Retrained from scratch. decided recall: 0.833 → 0.917.
Phase 4 (this) — LDAM-DRW head-only fine-tune (Cao et al. 2019, arXiv:1906.07413). Backbone + 4 non-predicate heads frozen; predicate head retrained with label-distribution-aware margin loss for 500 steps (400 steps with uniform class weights, then 100 steps with class-balanced weights).

Test-set metrics (13,884 held-out blurbs)

Metric	P3 v3	P4 LDAM
claim F1	0.962	0.962
claim precision	0.962	0.962
claim recall	0.962	0.962
schema_validity	1.000	1.000
abstention_accuracy	0.976	0.976
predicate_accuracy_macro	0.959	0.966
span_token_IoU	0.969	0.969
confidence MAE	0.022	0.022

Per-predicate recall (P3 v3 → P4 LDAM)

predicate	P3 v3	P4 LDAM	Δ
has_next_action	0.860	0.907	+0.047
has_quality_finding	0.883	0.917	+0.033
has_open_question	0.949	0.959	+0.010
decided	0.917	0.917	±0.000
blocked_by	1.000	1.000	±0.000
created_file	1.000	1.000	±0.000
ran_command	0.997	0.997	±0.000
touched_file	0.986	0.986	±0.000
has_status	0.982	0.982	±0.000
has_goal	0.983	0.978	−0.005
has_constraint	0.990	0.979	−0.011
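
The card does not define predicate_accuracy_macro. As a quick consistency check, the sketch below averages the eleven per-predicate recalls from the table above with equal weight per class. The result lines up with the predicate_accuracy_macro row of the test-set table (0.959 and 0.966), which suggests, but does not prove, that the metric is an unweighted mean over these predicates. The dictionary and helper names are illustrative only.

```python
# Per-predicate recall copied from the table above: (P3 v3, P4 LDAM).
RECALL = {
    "has_next_action": (0.860, 0.907),
    "has_quality_finding": (0.883, 0.917),
    "has_open_question": (0.949, 0.959),
    "decided": (0.917, 0.917),
    "blocked_by": (1.000, 1.000),
    "created_file": (1.000, 1.000),
    "ran_command": (0.997, 0.997),
    "touched_file": (0.986, 0.986),
    "has_status": (0.982, 0.982),
    "has_goal": (0.983, 0.978),
    "has_constraint": (0.990, 0.979),
}


def macro_average(values):
    """Unweighted mean: every class counts equally, regardless of its support."""
    values = list(values)
    return sum(values) / len(values)


p3_macro = macro_average(p3 for p3, _ in RECALL.values())
p4_macro = macro_average(p4 for _, p4 in RECALL.values())

print(f"P3 v3 macro recall:   {p3_macro:.3f}")  # 0.959
print(f"P4 LDAM macro recall: {p4_macro:.3f}")  # 0.966
```

Because each class has equal weight, the gains on the three rarer predicates outweigh the small regressions on has_goal and has_constraint in the macro figure.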

Calibration

Isotonic regression fitted on 3,848 positive calibration rows.

Base argmax accuracy: 0.9826
ECE raw → calibrated: 0.299 → 0.000

Calibrator shipped at calibrator.joblib.
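
For readers unfamiliar with the ECE figure above, here is a minimal sketch of expected calibration error. It assumes 10 equal-width bins over the top-class confidence; the card does not state the binning that produced 0.299 and 0.000, so treat the bin count and the function name as assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: sum over bins of (share of samples in bin) * |accuracy - mean confidence|."""
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one sample")

    # Assign each sample to an equal-width confidence bin; 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, h in members if h) / len(members)
        ece += (len(members) / n) * abs(accuracy - mean_conf)
    return ece


# Toy data (not from the model): always right but reporting 0.65 confidence.
raw = expected_calibration_error([0.65] * 4, [True] * 4)
# The same predictions after a hypothetical calibrator maps 0.65 to 1.0.
calibrated = expected_calibration_error([1.0] * 4, [True] * 4)
print(raw, calibrated)
```

In the toy case the raw scores are underconfident, so ECE is large even though every prediction is correct; a monotone remapping such as isotonic regression can close that gap without changing the argmax.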

LDAM fine-tune details

Trainable params: 0.002% of total (predicate head only: Linear(1024, 21) with bias, i.e. 1024 × 21 + 21 = 21,525 parameters)
LDAM max margin: 0.5 (Δ_j ∝ n_j^(-1/4), normalized)
DRW phase: class-balanced Effective-Number weights (β=0.9999) activated at step 400/500
Optimizer: AdamW, lr 1e-4, linear warmup 50 steps
Elapsed: 132s on RTX 5090
Backbone + claim_present + subject_type + confidence + span heads all FROZEN
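
The settings above can be sketched in plain Python as follows. This is a minimal illustration of LDAM margins, Effective-Number class weights, and the deferred re-weighting switch, not the training code from the source repo. The class counts are made up, the `scale` parameter is an assumption (the card does not state a logit scale), and treating "step 400" as a zero-based threshold is also an assumption.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), rescaled so the largest equals max_margin."""
    raw = [n ** -0.25 for n in class_counts]
    factor = max_margin / max(raw)
    return [r * factor for r in raw]


def effective_number_weights(class_counts, beta=0.9999):
    """Class-balanced weights (1 - beta) / (1 - beta^n_j), normalized to sum to the class count."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    norm = len(raw) / sum(raw)
    return [r * norm for r in raw]


def drw_weights(step, class_counts, switch_step=400, beta=0.9999):
    """Deferred re-weighting: uniform weights before switch_step, class-balanced from it onward."""
    if step < switch_step:
        return [1.0] * len(class_counts)
    return effective_number_weights(class_counts, beta)


def ldam_loss(logits, target, margins, weights, scale=1.0):
    """Weighted cross-entropy after subtracting the target class margin from its logit."""
    shifted = [
        scale * (z - margins[j] if j == target else z)
        for j, z in enumerate(logits)
    ]
    peak = max(shifted)  # subtract the max for a numerically stable log-sum-exp
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in shifted))
    return weights[target] * (log_norm - shifted[target])


# Hypothetical class counts for three predicates, from frequent to rare.
counts = [50000, 5000, 500]
margins = ldam_margins(counts)
for step in (0, 399, 400, 499):
    weights = drw_weights(step, counts)
    loss = ldam_loss([1.2, 0.3, 0.9], target=2, margins=margins, weights=weights)
    print(step, [round(w, 3) for w in weights], round(loss, 4))
```

Rare classes get the largest margin throughout, while the class-balanced weights only apply in the final phase, which is the idea behind deferring the re-weighting.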

Loading from Python

import torch, torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer
import joblib

class IngestionEncoder(nn.Module):
    def __init__(self, backbone, load_dtype=torch.float32):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone, torch_dtype=load_dtype)
        h = self.backbone.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        self.head_claim_present = nn.Linear(h, 1)
        self.head_predicate = nn.Linear(h, 21)
        self.head_subject_type = nn.Linear(h, 13)
        self.head_confidence = nn.Linear(h, 1)
        self.head_span = nn.Linear(h, 2)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        h = out.last_hidden_state
        pooled = self.dropout(h[:, 0])
        span_logits = self.head_span(self.dropout(h))
        s_log, e_log = span_logits.split(1, dim=-1)
        very_neg = torch.finfo(s_log.dtype).min
        mask = (attention_mask == 0)
        return {
            "claim_logit": self.head_claim_present(pooled).squeeze(-1),
            "predicate_logits": self.head_predicate(pooled),
            "subject_type_logits": self.head_subject_type(pooled),
            "confidence_pred": torch.sigmoid(self.head_confidence(pooled).squeeze(-1)),
            "span_start_logits": s_log.squeeze(-1).masked_fill(mask, very_neg),
            "span_end_logits": e_log.squeeze(-1).masked_fill(mask, very_neg),
        }

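# Download the checkpoint and the isotonic calibrator from the Hub.
ckpt_path = hf_hub_download("Avifenesh/episodic-ingestion-compiler-modernbert-large-span-5000", "best.pt")
cal_path = hf_hub_download("Avifenesh/episodic-ingestion-compiler-modernbert-large-span-5000", "calibrator.joblib")
# weights_only=False unpickles arbitrary Python objects; only load checkpoints you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
# The checkpoint stores the backbone name alongside the weights.
model = IngestionEncoder(ckpt["backbone"])
model.load_state_dict(ckpt["state_dict"])
model.eval()  # disable dropout for inference
tok = AutoTokenizer.from_pretrained(ckpt["backbone"])
calibrator = joblib.load(cal_path)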

Use scripts/ingestion_wrapper.py from the source repo to assemble spec-v1 claims from the model output.
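
To give a feel for what the wrapper has to do with the span head, here is a minimal sketch of turning per-token start and end logits into a token span. It is a hypothetical helper, not the logic in scripts/ingestion_wrapper.py, which may decode differently; the `max_span_tokens` cap is also an assumption.

```python
def best_span(start_logits, end_logits, max_span_tokens=64):
    """Return inclusive token indices (start, end) maximizing start + end logit with start <= end."""
    best, best_score = (0, 0), float("-inf")
    for s, s_logit in enumerate(start_logits):
        # Only consider ends at or after the start, within the length cap.
        last = min(len(end_logits), s + max_span_tokens)
        for e in range(s, last):
            score = s_logit + end_logits[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best


# Toy logits for a four-token input (not model output).
start = [0.1, 2.0, 0.3, -1.0]
end = [0.0, 0.5, 1.5, 3.0]
print(best_span(start, end))  # (1, 3)
```

With the model loaded as above, the inputs would be something like `out["span_start_logits"][0].tolist()` and `out["span_end_logits"][0].tolist()`, where `out` is the dictionary returned by `forward`. Padding positions are already filled with a very negative value there, so they never win. Mapping token indices back to characters needs the tokenizer's offsets (for example, `return_offsets_mapping=True` on a fast tokenizer).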

For Rust / Python-less deployment

Use the companion repo, which ships bf16 safetensors, JSON metadata, and a native wrapper:

https://huggingface.co/Avifenesh/episodic-ingestion-compiler-rust-bundle

use ingestion_model::{Bundle, IngestionService, Request, WrapperConfig};
let bundle = Bundle::load("ingestion_model_v1")?;
let service = IngestionService::load(bundle, device, WrapperConfig::default())?;
let resp = service.predict(&Request {
    text: "Make sure the build stays under 50 MB.".into(),
    role: Some("user".into()),
    ..Default::default()
})?;

Label spaces (spec-v1)

21 predicates — Class S: validated_by, had_outcome, failed_because, worked_because, decided, blocked_by, has_next_action, has_status, has_goal. Class E: touched_file, ran_command, logged_event, has_constraint, has_open_question, has_quality_finding, reverted_file, deleted_file, created_file, committed, deployed, incident_observed.

13 subject types: objective, command, file, pr, incident, policy, person, repo, team, service, document, ticket, thread.

Deferred (never emitted by the model): has_current_input, has_phase.

Source repo

https://github.com/avifenesh/episodic-ingestion-compiler — see docs/phase-3-results-2026-05-09.md and docs/phase-4-results-2026-05-09.md for full methodology.

Paper

Cao et al. 2019, "Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss", arXiv:1906.07413 (published Jun 18, 2019).