DITTO: Diffusion Inference-Time T-Optimization for Music Generation Paper • 2401.12179 • Published Jan 22, 2024 • 19
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws Paper • 2401.00448 • Published Dec 31, 2023 • 28
DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models Paper • 2309.14509 • Published Sep 25, 2023 • 17
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper • 2309.12307 • Published Sep 21, 2023 • 87
Boolformer: Symbolic Regression of Logic Functions with Transformers Paper • 2309.12207 • Published Sep 21, 2023 • 11
LMDX: Language Model-based Document Information Extraction and Localization Paper • 2309.10952 • Published Sep 19, 2023 • 65
Cure the headache of Transformers via Collinear Constrained Attention Paper • 2309.08646 • Published Sep 15, 2023 • 12
Neurons in Large Language Models: Dead, N-gram, Positional Paper • 2309.04827 • Published Sep 9, 2023 • 16
ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models Paper • 2309.00986 • Published Sep 2, 2023 • 17
PromptTTS 2: Describing and Generating Voices with Text Prompt Paper • 2309.02285 • Published Sep 5, 2023 • 11
FACET: Fairness in Computer Vision Evaluation Benchmark Paper • 2309.00035 • Published Aug 31, 2023 • 16