Collections
Collections including paper arxiv:2307.08621
- The Impact of Depth and Width on Transformer Language Model Generalization
  Paper • 2310.19956 • Published • 9
- Retentive Network: A Successor to Transformer for Large Language Models
  Paper • 2307.08621 • Published • 170
- RWKV: Reinventing RNNs for the Transformer Era
  Paper • 2305.13048 • Published • 15
- Attention Is All You Need
  Paper • 1706.03762 • Published • 49

- Detecting Pretraining Data from Large Language Models
  Paper • 2310.16789 • Published • 10
- Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
  Paper • 2310.13671 • Published • 18
- AutoMix: Automatically Mixing Language Models
  Paper • 2310.12963 • Published • 14
- An Emulator for Fine-Tuning Large Language Models using Small Language Models
  Paper • 2310.12962 • Published • 14

- SIMPL: A Simple and Efficient Multi-agent Motion Prediction Baseline for Autonomous Driving
  Paper • 2402.02519 • Published
- Mixtral of Experts
  Paper • 2401.04088 • Published • 158
- Optimal Transport Aggregation for Visual Place Recognition
  Paper • 2311.15937 • Published
- GOAT: GO to Any Thing
  Paper • 2311.06430 • Published • 14