Magenta RealTime 2 — AOTInductor step graphs (base)
Prebuilt, weight-less AOTInductor compiled graphs for the per-frame generation step of
magenta-torch/magenta-realtime-2.
They let the model run faster than real time without calling torch.compile at
runtime (e.g. on ZeroGPU, where torch.compile is unavailable).
⚠️ Hardware-specific
AOTI artifacts are compiled for a specific GPU architecture. These were built for the
NVIDIA RTX PRO 6000 (Blackwell, sm_120), the GPU architecture that ZeroGPU runs on. They
will not load on other GPUs (A100, H100, L4, T4, consumer cards, …).
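One way to check whether a machine matches is to compare the GPU's compute capability with sm_120, which corresponds to capability (12, 0). The sketch below is a minimal, unofficial check. The helper names `matches_prebuilt_graphs` and `current_capability` are illustrative and not part of the model's API, and treating an exact (12, 0) match as the only valid target is a conservative assumption. It relies only on `torch.cuda.is_available` and `torch.cuda.get_device_capability`.

```python
from typing import Optional, Tuple

# sm_120 (the architecture named above) is compute capability 12.0.
REQUIRED_CAPABILITY: Tuple[int, int] = (12, 0)


def matches_prebuilt_graphs(capability: Optional[Tuple[int, int]]) -> bool:
    """Return True only if the capability is exactly the one the graphs target."""
    return capability is not None and tuple(capability) == REQUIRED_CAPABILITY


def current_capability(device: int = 0) -> Optional[Tuple[int, int]]:
    """Return (major, minor) for a CUDA device, or None if no usable GPU is found."""
    try:
        import torch  # imported lazily so the pure check above works without torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(device)


if __name__ == "__main__":
    cap = current_capability()
    print(f"capability={cap}, prebuilt graphs usable={matches_prebuilt_graphs(cap)}")
```

Keeping the comparison separate from the detection makes the decision easy to test on machines without a GPU.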
On any other GPU, don't use these. Instead:
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("magenta-torch/magenta-realtime-2",
    trust_remote_code=True, dtype=torch.bfloat16).to("cuda")
model.compile_steps()  # portable torch.compile path; works on any CUDA GPU
Alternatively, export your own AOTI graphs for your architecture from the per-frame step
(MagentaRT2ForConditionalGeneration.depthformer.decoder); see the converter/compile
utilities in the dev repo (fork).
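For orientation, the generic PyTorch AOTInductor export flow looks roughly like the sketch below. It is not the dev repo's converter. It uses a hypothetical toy module, `TinyStep`, in place of the real decoder step, whose example inputs are not described here, and it packages the weights with the graph rather than producing weight-less artifacts. It assumes PyTorch 2.6 or later, where `torch._inductor.aoti_compile_and_package` and `torch._inductor.aoti_load_package` are available, plus a working compiler toolchain.

```python
def export_step_graph(module, example_args, package_path):
    """Trace `module` with torch.export and compile it into a .pt2 AOTI package."""
    import torch

    module = module.eval()
    with torch.no_grad():
        exported = torch.export.export(module, example_args)
        # Compilation targets the device of the machine running this call,
        # which is why the resulting package is hardware-specific.
        return torch._inductor.aoti_compile_and_package(
            exported, package_path=package_path
        )


def main():
    import torch

    class TinyStep(torch.nn.Module):
        """Hypothetical stand-in for a per-frame step; not the real decoder."""

        def __init__(self):
            super().__init__()
            self.proj = torch.nn.Linear(16, 16)

        def forward(self, x):
            return torch.relu(self.proj(x))

    device = "cuda" if torch.cuda.is_available() else "cpu"
    module = TinyStep().to(device)
    example_args = (torch.randn(1, 16, device=device),)

    path = export_step_graph(module, example_args, "tiny_step.pt2")
    compiled = torch._inductor.aoti_load_package(path)  # callable like the module
    print(compiled(*example_args).shape)


if __name__ == "__main__":
    main()
```

The real step graphs will need example inputs with the decoder's actual shapes and dtypes, so treat this only as a map of the calls involved.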
Use (only on matching hardware)
model.load_compiled("magenta-torch/magenta-rt-aoti-base") # binds the weight-less graphs to the model's weights
The repository contains temporal.pt2 and depth.pt2 (the two hot step graphs) plus metadata.
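The two paths described in this card can be combined into one startup routine. The sketch below is an illustration under stated assumptions, not code from the repository: `prepare_model` is a hypothetical helper, it treats compute capability (12, 0) as the only match for sm_120, and it relies solely on the `load_compiled` and `compile_steps` methods shown above.

```python
from typing import Tuple

AOTI_REPO = "magenta-torch/magenta-rt-aoti-base"  # repo id as used in this card
SM_120: Tuple[int, int] = (12, 0)  # compute capability corresponding to sm_120


def prepare_model(model, capability: Tuple[int, int]) -> str:
    """Pick the acceleration path for `model` and return which one was used."""
    if tuple(capability) == SM_120:
        # Matching hardware: bind the prebuilt weight-less graphs to the weights.
        model.load_compiled(AOTI_REPO)
        return "aoti"
    # Any other CUDA GPU: fall back to the portable torch.compile path.
    model.compile_steps()
    return "torch.compile"
```

With a loaded model, this could be called as `prepare_model(model, torch.cuda.get_device_capability())`.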