Graph Neural Networks for Learning Equivariant Representations of Neural Networks Paper • 2403.12143 • Published Mar 18
Accelerating Training with Neuron Interaction and Nowcasting Networks Paper • 2409.04434 • Published Sep 6
Can We Scale Transformers to Predict Parameters of Diverse ImageNet Models? Paper • 2303.04143 • Published Mar 7, 2023
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers Paper • 2406.00153 • Published May 31 • 10
LoGAH: Predicting 774-Million-Parameter Transformers using Graph HyperNetworks with 1/100 Parameters Paper • 2405.16287 • Published May 25 • 10
Unlocking Slot Attention by Changing Optimal Transport Costs Paper • 2301.13197 • Published Jan 30, 2023
Promoting Exploration in Memory-Augmented Adam using Critical Momenta Paper • 2307.09638 • Published Jul 18, 2023 • 2