---
license: apache-2.0
---

Pretrained and fine-tuned weights for the CVPR'25 paper "Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation".