episodic-ingestion-compiler: ModernBERT-large + span + LDAM (Phase 4)

Fine-tuned from answerdotai/ModernBERT-large for the ingestion-model-spec-v1 task in episodic-rs. The model produces only the information-bearing subset of a claim candidate; a deterministic wrapper (available in Python at scripts/ingestion_wrapper.py, or natively in Rust via episodic-ingestion-compiler-rust-bundle) assembles the full JSON.

Phase 4 = Phase 2 → 3 → 4 pipeline

  • Phase 2A: ModernBERT-large + 5 heads (claim_present, predicate, subject_type, confidence, span). Trained for 5000 steps on 94k labeled conversational blurbs.
  • Phase 3: expanded training set with autonomous-agent traces (tool_call → tool_result joins) and a tightened deterministic labeler. Retrained from scratch. Recall on decided: 0.833 → 0.917.
  • Phase 4 (this): LDAM-DRW head-only fine-tune (Cao et al. 2019, arXiv:1906.07413). Backbone + 4 non-predicate heads frozen; predicate head retrained with label-distribution-aware margin loss for 500 steps (uniform CE for 400 steps, then class-balanced for 100).

Test-set metrics (13,884 held-out blurbs)

| Metric | P3 v3 | P4 LDAM |
|---|---|---|
| claim F1 | 0.962 | 0.962 |
| claim precision | 0.962 | 0.962 |
| claim recall | 0.962 | 0.962 |
| schema_validity | 1.000 | 1.000 |
| abstention_accuracy | 0.976 | 0.976 |
| predicate_accuracy_macro | 0.959 | 0.966 |
| span_token_IoU | 0.969 | 0.969 |
| confidence MAE | 0.022 | 0.022 |
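
The card does not define span_token_IoU. One common reading, assumed here, is the intersection-over-union of the predicted and gold token index ranges, averaged over examples. The helper names and the inclusive (start, end) convention below are illustrative and not taken from the source repo.

```python
def token_iou(pred, gold):
    """IoU of two token spans given as inclusive (start, end) index pairs.

    Assumption: spans are contiguous token ranges; the source repo may
    define the metric differently.
    """
    (ps, pe), (gs, ge) = pred, gold
    # The overlap is empty when one span ends before the other starts.
    inter = max(0, min(pe, ge) - max(ps, gs) + 1)
    union = (pe - ps + 1) + (ge - gs + 1) - inter
    return inter / union


def mean_token_iou(pairs):
    """Average IoU over an iterable of (pred, gold) span pairs."""
    pairs = list(pairs)
    return sum(token_iou(p, g) for p, g in pairs) / len(pairs)
```

Under this reading, a value of 0.969 would mean predicted spans overlap the gold spans almost token for token on average.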

Per-predicate recall (P3 v3 → P4 LDAM)

| predicate | P3 v3 | P4 LDAM | Δ |
|---|---|---|---|
| has_next_action | 0.860 | 0.907 | +0.047 |
| has_quality_finding | 0.883 | 0.917 | +0.033 |
| has_open_question | 0.949 | 0.959 | +0.010 |
| decided | 0.917 | 0.917 | ± |
| blocked_by | 1.000 | 1.000 | ± |
| created_file | 1.000 | 1.000 | ± |
| ran_command | 0.997 | 0.997 | ± |
| touched_file | 0.986 | 0.986 | ± |
| has_status | 0.982 | 0.982 | ± |
| has_goal | 0.983 | 0.978 | −0.005 |
| has_constraint | 0.990 | 0.979 | −0.011 |
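
As a quick consistency check, the unweighted mean of the per-predicate recalls above reproduces the predicate_accuracy_macro figures reported in the test-set metrics. This assumes the macro average is taken over exactly the 11 predicates listed here; the card does not state that explicitly.

```python
# Per-predicate recall copied from the table above: (P3 v3, P4 LDAM).
recall = {
    "has_next_action": (0.860, 0.907),
    "has_quality_finding": (0.883, 0.917),
    "has_open_question": (0.949, 0.959),
    "decided": (0.917, 0.917),
    "blocked_by": (1.000, 1.000),
    "created_file": (1.000, 1.000),
    "ran_command": (0.997, 0.997),
    "touched_file": (0.986, 0.986),
    "has_status": (0.982, 0.982),
    "has_goal": (0.983, 0.978),
    "has_constraint": (0.990, 0.979),
}


def macro_mean(values):
    """Unweighted mean: every class counts equally, regardless of support."""
    values = list(values)
    return sum(values) / len(values)


p3_macro = macro_mean(p3 for p3, _ in recall.values())
p4_macro = macro_mean(p4 for _, p4 in recall.values())
print(f"{p3_macro:.3f} -> {p4_macro:.3f}")  # 0.959 -> 0.966
```

Because every class has equal weight in a macro average, the gains on the three weakest predicates outweigh the small regressions on has_goal and has_constraint.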

Calibration

Isotonic regression fitted on 3,848 positive calibration rows.

  • Base argmax accuracy: 0.9826
  • ECE raw → calibrated: 0.299 → 0.000

Calibrator shipped at calibrator.joblib.
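
For readers unfamiliar with ECE (expected calibration error), the following is a minimal sketch of the usual binned estimate. The 10 equal-width bins and the function name are assumptions; the card does not say which binning scheme produced the numbers above.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: support-weighted mean of |accuracy - mean confidence|.

    confidences: predicted probabilities in [0, 1]
    correct:     booleans, True where the prediction was right
    n_bins:      number of equal-width bins (an assumption, not from the card)
    """
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 falls into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(accuracy - avg_conf)
    return ece
```

A model that says 0.6 but is always right has an ECE of 0.4 under this estimate; a calibrator's job is to remap such scores so that stated confidence tracks observed accuracy.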

LDAM fine-tune details

  • Trainable params: 0.002% of total (predicate head only: Linear(1024, 21) with bias)
  • LDAM max margin: 0.5 (Δ_j ∝ n_j^(-1/4), normalized)
  • DRW phase: class-balanced Effective-Number weights (β=0.9999) activated at step 400/500
  • Optimizer: AdamW, lr 1e-4, linear warmup 50 steps
  • Elapsed: 132s on RTX 5090
  • Backbone + claim_present + subject_type + confidence + span heads all FROZEN
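
The margin and re-weighting settings above can be sketched in plain Python as follows. This is an illustration of LDAM-DRW as described by Cao et al., not the training code from the source repo: the class counts are made up, and the logit scale of 30 is the default of the paper's reference implementation, which the card does not confirm.

```python
import math


def ldam_margins(class_counts, max_margin=0.5):
    """Per-class margins proportional to n_j^(-1/4), scaled so the largest is max_margin."""
    raw = [n ** -0.25 for n in class_counts]
    scale = max_margin / max(raw)
    return [r * scale for r in raw]


def effective_number_weights(class_counts, beta=0.9999):
    """Class-balanced weights (1 - beta) / (1 - beta^n_j), normalized to sum to the class count."""
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    total = sum(raw)
    return [w * len(class_counts) / total for w in raw]


def drw_weights(step, class_counts, switch_step=400, beta=0.9999):
    """Deferred re-weighting: uniform (None) before switch_step, class-balanced after."""
    if step < switch_step:
        return None
    return effective_number_weights(class_counts, beta)


def ldam_loss(logits, target, margins, weights=None, scale=30.0):
    """Margin-adjusted cross-entropy for one example.

    scale=30 is an assumption borrowed from the LDAM reference implementation.
    """
    adjusted = list(logits)
    adjusted[target] -= margins[target]  # only the true-class logit is penalized
    scaled = [scale * z for z in adjusted]
    peak = max(scaled)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in scaled))
    nll = log_norm - scaled[target]
    return nll if weights is None else weights[target] * nll


# Hypothetical class counts, head to tail.
counts = [50000, 5000, 200]
margins = ldam_margins(counts)
loss_early = ldam_loss([2.0, 1.0, 0.5], 2, margins, drw_weights(100, counts))
loss_late = ldam_loss([2.0, 1.0, 0.5], 2, margins, drw_weights(450, counts))
```

Rare classes receive both the largest margin and, once the DRW phase starts, the largest loss weight, which is consistent with the recall gains on the weaker predicates reported above.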

Loading from Python

import torch, torch.nn as nn
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer
import joblib

class IngestionEncoder(nn.Module):
    def __init__(self, backbone, load_dtype=torch.float32):
        super().__init__()
        self.backbone = AutoModel.from_pretrained(backbone, torch_dtype=load_dtype)
        h = self.backbone.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        self.head_claim_present = nn.Linear(h, 1)
        self.head_predicate = nn.Linear(h, 21)
        self.head_subject_type = nn.Linear(h, 13)
        self.head_confidence = nn.Linear(h, 1)
        self.head_span = nn.Linear(h, 2)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
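        h = out.last_hidden_state
        # Sequence-level heads read the first-token representation.
        pooled = self.dropout(h[:, 0])
        # The span head scores every token as a span start and as a span end.
        span_logits = self.head_span(self.dropout(h))
        s_log, e_log = span_logits.split(1, dim=-1)
        # Padding positions get the dtype minimum so they never win an argmax.
        very_neg = torch.finfo(s_log.dtype).min
        mask = (attention_mask == 0)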
        return {
            "claim_logit": self.head_claim_present(pooled).squeeze(-1),
            "predicate_logits": self.head_predicate(pooled),
            "subject_type_logits": self.head_subject_type(pooled),
            "confidence_pred": torch.sigmoid(self.head_confidence(pooled).squeeze(-1)),
            "span_start_logits": s_log.squeeze(-1).masked_fill(mask, very_neg),
            "span_end_logits": e_log.squeeze(-1).masked_fill(mask, very_neg),
        }

# Download the checkpoint and the isotonic calibrator from the Hub.
repo_id = "Avifenesh/episodic-ingestion-compiler-modernbert-large-span-5000"
ckpt_path = hf_hub_download(repo_id, "best.pt")
cal_path = hf_hub_download(repo_id, "calibrator.joblib")
# weights_only=False unpickles arbitrary objects; only load checkpoints you trust.
ckpt = torch.load(ckpt_path, map_location="cpu", weights_only=False)
# The checkpoint stores the backbone name alongside the weights.
model = IngestionEncoder(ckpt["backbone"])
model.load_state_dict(ckpt["state_dict"])
model.eval()  # disable dropout for inference
tok = AutoTokenizer.from_pretrained(ckpt["backbone"])
calibrator = joblib.load(cal_path)

Use scripts/ingestion_wrapper.py from the source repo to assemble spec-v1 claims from the model output.

For Rust / Python-less deployment

Use the companion repo, which has bf16 safetensors + JSON metadata + native wrapper:

https://huggingface.co/Avifenesh/episodic-ingestion-compiler-rust-bundle

use ingestion_model::{Bundle, IngestionService, Request, WrapperConfig};
let bundle = Bundle::load("ingestion_model_v1")?;
// `device` is not defined in this snippet; construct it for your backend before this call.
let service = IngestionService::load(bundle, device, WrapperConfig::default())?;
let resp = service.predict(&Request {
    text: "Make sure the build stays under 50 MB.".into(),
    role: Some("user".into()),
    ..Default::default()
})?;

Label spaces (spec-v1)

21 predicates. Class S: validated_by, had_outcome, failed_because, worked_because, decided, blocked_by, has_next_action, has_status, has_goal. Class E: touched_file, ran_command, logged_event, has_constraint, has_open_question, has_quality_finding, reverted_file, deleted_file, created_file, committed, deployed, incident_observed.

13 subject types: objective, command, file, pr, incident, policy, person, repo, team, service, document, ticket, thread.

Deferred (never emitted by the model): has_current_input, has_phase.
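
The label spaces can be written down as constants to sanity-check head sizes when loading the checkpoint. The list order below simply follows this card; it is not guaranteed to match the index order of the classification heads, which is defined by the wrapper and bundle metadata.

```python
# Label names copied from the card. Order is card order, NOT necessarily head index order.
CLASS_S = [
    "validated_by", "had_outcome", "failed_because", "worked_because",
    "decided", "blocked_by", "has_next_action", "has_status", "has_goal",
]
CLASS_E = [
    "touched_file", "ran_command", "logged_event", "has_constraint",
    "has_open_question", "has_quality_finding", "reverted_file",
    "deleted_file", "created_file", "committed", "deployed",
    "incident_observed",
]
PREDICATES = CLASS_S + CLASS_E
SUBJECT_TYPES = [
    "objective", "command", "file", "pr", "incident", "policy", "person",
    "repo", "team", "service", "document", "ticket", "thread",
]
DEFERRED = ["has_current_input", "has_phase"]

# Sizes should agree with the output dimensions of the predicate and subject_type heads.
assert len(PREDICATES) == 21
assert len(SUBJECT_TYPES) == 13
# Deferred predicates are outside the model's label space.
assert not set(DEFERRED) & set(PREDICATES)
```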

Source repo

https://github.com/avifenesh/episodic-ingestion-compiler. See docs/phase-3-results-2026-05-09.md and docs/phase-4-results-2026-05-09.md for full methodology.
