feat: implement hardware-adaptive compute bounding and dynamic entropy routing (Eqs. 3-4)

by dataopsnick - opened 6 days ago

base: refs/heads/main

←

from: refs/pr/2

+37

-10

dataopsnick

Owner 6 days ago

Context & Motivation
This PR aligns the ADAPT-DIFF pipeline implementation with the claims made in Section 2.3 of the latest manuscript draft ("Hardware-Adaptive Bounding"). Previously, the token refinement stage relied on a hardcoded static threshold (entropy_threshold=1.5), so the amount of refinement could not respond to the available compute. This update introduces a dynamic compute-budget router that recomputes the threshold at every step and keeps the step's proxy FLOP cost within a target budget.

Key Changes

Replaced LogitUncertaintyFilter with HardwareAdaptiveRouter: The routing module now accepts relative compute costs for base block generations (c_base) and bfloat16 refinements (c_bf16).
Dynamic Budgeting (Equation 3): The router now calculates the maximum permissible number of tokens to refine in bfloat16 based on an active computational ceiling ($C_{step} \le C_{target}$).
Infimum Thresholding (Equation 4): The router computes dynamic_tau ($\tau$) at every step by sorting token uncertainty (LogTokU) and taking the smallest threshold at which the number of tokens selected for refinement stays within the budget from Equation 3.
Pipeline Integration: Updated ADAPTDIFFPipeline to accept target_budget instead of a static float, allowing downstream deployment to dynamically throttle or increase token refinement depth based on live GPU/system load.
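
For reviewers, the budgeting and thresholding steps above can be sketched in plain Python. This is a minimal illustration, not the code in this PR. It assumes that the step cost is $C_{step} = N \cdot c_{base} + k \cdot c_{bf16}$ for $N$ tokens with $k$ of them refined, and that Equation 4 takes $\tau$ as the smallest value for which at most $k_{max}$ uncertainty scores exceed it. The function names (max_refinable_tokens, dynamic_tau, route) are hypothetical, and plain lists stand in for tensors.

```python
import math
from typing import List, Tuple


def max_refinable_tokens(n_tokens: int, target_budget: float,
                         c_base: float = 1.0, c_bf16: float = 5.0) -> int:
    """Assumed form of Eq. 3: largest k with n_tokens*c_base + k*c_bf16 <= target_budget."""
    remaining = target_budget - n_tokens * c_base
    if remaining <= 0:
        return 0  # the budget does not even cover the base pass
    return min(n_tokens, math.floor(remaining / c_bf16))


def dynamic_tau(uncertainty: List[float], k_max: int) -> float:
    """Assumed form of Eq. 4: smallest tau such that at most k_max scores exceed it."""
    if k_max >= len(uncertainty):
        return -math.inf  # every token may be refined
    ranked = sorted(uncertainty, reverse=True)
    # At most k_max scores are strictly greater than the (k_max + 1)-th largest.
    return ranked[k_max]


def route(uncertainty: List[float], target_budget: float,
          c_base: float = 1.0, c_bf16: float = 5.0) -> Tuple[float, List[bool]]:
    """Return (tau, mask), where mask[i] is True if token i is refined."""
    k_max = max_refinable_tokens(len(uncertainty), target_budget, c_base, c_bf16)
    tau = dynamic_tau(uncertainty, k_max)
    return tau, [u > tau for u in uncertainty]


if __name__ == "__main__":
    scores = [0.2, 1.9, 0.7, 2.4, 1.1, 0.4]
    tau, mask = route(scores, target_budget=16.0)
    print(tau, mask)
```

With six tokens and a budget of 16.0, the base pass costs 6.0, which leaves room for two refinements at 5.0 each, so the two most uncertain tokens are selected. Because the mask uses a strict comparison, tied scores at the threshold are left unrefined, which keeps the step under budget at the cost of sometimes refining fewer than $k_{max}$ tokens.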

Impact & Validation
These changes close the gap between Section 2.3 of the manuscript and the code. With $\tau$ now recomputed at every step from the compute budget, the implementation matches the mechanism behind the paper's claim of providing a "Pareto-optimal approach for LLM inference" that can trade off FLOPs and task accuracy adaptively.

Reviewer Notes

The proxy FLOP cost defaults are currently set to c_base=1.0 and c_bf16=5.0 for normalized tracking. These can be adjusted to hardware-specific latency metrics if profiled.
Ensure downstream inference scripts are updated to pass target_budget instead of entropy_threshold.
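
As a rough guide for the two notes above, the sketch below shows one way to derive normalized costs from profiled latencies and to pick a target_budget when migrating from entropy_threshold. Both helpers (normalized_costs, budget_for_fraction) are hypothetical and not part of this PR. They assume the same linear cost model, $C_{step} = N \cdot c_{base} + k \cdot c_{bf16}$, and that latencies are measured per token.

```python
from typing import Tuple


def normalized_costs(base_latency_ms: float, bf16_latency_ms: float) -> Tuple[float, float]:
    """Scale profiled per-token latencies so that c_base == 1.0."""
    if base_latency_ms <= 0 or bf16_latency_ms <= 0:
        raise ValueError("latencies must be positive")
    return 1.0, bf16_latency_ms / base_latency_ms


def budget_for_fraction(n_tokens: int, refine_fraction: float,
                        c_base: float = 1.0, c_bf16: float = 5.0) -> float:
    """Budget that lets roughly refine_fraction of the tokens be refined."""
    if not 0.0 <= refine_fraction <= 1.0:
        raise ValueError("refine_fraction must be in [0, 1]")
    return n_tokens * c_base + refine_fraction * n_tokens * c_bf16


if __name__ == "__main__":
    c_base, c_bf16 = normalized_costs(base_latency_ms=2.0, bf16_latency_ms=9.0)
    print(c_base, c_bf16)
    print(budget_for_fraction(8, 0.25))
```

Expressing the budget as a refinement fraction is only a convenience for choosing a starting value. The router itself still receives a single target_budget number.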

feat: implement hardware-adaptive compute bounding and dynamic entropy routing (Eqs. 3-4) cfdc4674

dataopsnick

Owner 6 days ago

I found several mistakes in the original code implementation of the ADAPT-DIFF paper using the "Preflight Check" at https://loopmaxxer.review.

I'm updating the code and preprint manuscript to bring them into alignment until I have a fully working implementation (e.g. a proper latent diffusion process vs a multi-token generator head) of the original ADAPT-DIFF preprint specification.

dataopsnick changed pull request status to merged 6 days ago

dataopsnick deleted the refs/pr/2 ref 6 days ago
