Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published 19 days ago • 57
Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V Paper • 2310.11441 • Published Oct 17, 2023 • 24
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models Paper • 2401.13919 • Published Jan 25 • 21
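Sora Reference Papers Collection The reference papers listed at the end of OpenAI's "Video generation models as world simulators" technical report, 32 in total. OpenAI's ImageGPT and DALL-E 3 papers are missing; their links have been added to the note. • 32 items • Updated Feb 18 • 53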
Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation Paper • 2401.15688 • Published Jan 28 • 10
DREAM: Diffusion Rectification and Estimation-Adaptive Models Paper • 2312.00210 • Published Nov 30, 2023 • 14
FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting Paper • 2312.00451 • Published Dec 1, 2023 • 8
Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models Paper • 2311.12092 • Published Nov 20, 2023 • 19
Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text Paper • 2311.07446 • Published Nov 13, 2023 • 27
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 68
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 29
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT) Paper • 2309.08968 • Published Sep 16, 2023 • 22
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 77
Contrastive Decoding Improves Reasoning in Large Language Models Paper • 2309.09117 • Published Sep 17, 2023 • 37