FIFO-Diffusion: Generating Infinite Videos from Text without Training Paper • 2405.11473 • Published 4 days ago • 42 • 8
MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning Paper • 2405.12130 • Published 3 days ago • 35 • 7
Chameleon: Mixed-Modal Early-Fusion Foundation Models Paper • 2405.09818 • Published 7 days ago • 82 • 7
Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model Paper • 2405.09215 • Published 8 days ago • 13 • 1
ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models Paper • 2405.09220 • Published 8 days ago • 22 • 1
Compositional Text-to-Image Generation with Dense Blob Representations Paper • 2405.08246 • Published 10 days ago • 11 • 1
SUTRA: Scalable Multilingual Language Model Architecture Paper • 2405.06694 • Published 16 days ago • 34 • 2
RLHF Workflow: From Reward Modeling to Online RLHF Paper • 2405.07863 • Published 10 days ago • 55 • 5
What matters when building vision-language models? Paper • 2405.02246 • Published 20 days ago • 85 • 1
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report Paper • 2405.00732 • Published 24 days ago • 110 • 9
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 21 days ago • 96 • 11
STT: Stateful Tracking with Transformers for Autonomous Driving Paper • 2405.00236 • Published 23 days ago • 7 • 2
Paint by Inpaint: Learning to Add Image Objects by Removing Them First Paper • 2404.18212 • Published 25 days ago • 19 • 4
Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings Paper • 2404.16820 • Published 28 days ago • 15 • 2
Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding Paper • 2404.16710 • Published 28 days ago • 55 • 8
How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites Paper • 2404.16821 • Published 28 days ago • 49 • 4
OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework Paper • 2404.14619 • Published about 1 month ago • 120 • 13
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published Apr 19 • 37 • 8
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 235 • 40
DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting Paper • 2404.06903 • Published Apr 10 • 14 • 2
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14 • 119 • 10
StableDrag: Stable Dragging for Point-based Image Editing Paper • 2403.04437 • Published Mar 7 • 23 • 3
Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers Paper • 2402.19479 • Published Feb 29 • 30 • 2
CodePlan: Repository-level Coding using LLMs and Planning Paper • 2309.12499 • Published Sep 21, 2023 • 68 • 14
SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering Paper • 2311.12775 • Published Nov 21, 2023 • 28 • 2
GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation Paper • 2311.07562 • Published Nov 13, 2023 • 11 • 1
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation Paper • 2311.01455 • Published Nov 2, 2023 • 25 • 2
HyperFields: Towards Zero-Shot Generation of NeRFs from Text Paper • 2310.17075 • Published Oct 26, 2023 • 13 • 2
3D-GPT: Procedural 3D Modeling with Large Language Models Paper • 2310.12945 • Published Oct 19, 2023 • 52 • 2
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection Paper • 2310.11511 • Published Oct 17, 2023 • 62 • 2
Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams Paper • 2310.08678 • Published Oct 12, 2023 • 11 • 3
Table-GPT: Table-tuned GPT for Diverse Table Tasks Paper • 2310.09263 • Published Oct 13, 2023 • 36 • 11
Lemur: Harmonizing Natural Language and Code for Language Agents Paper • 2310.06830 • Published Oct 10, 2023 • 29 • 2
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning Paper • 2310.03731 • Published Oct 5, 2023 • 25 • 3
Decoding speech from non-invasive brain recordings Paper • 2208.12266 • Published Aug 25, 2022 • 4 • 1
Large Language Models Cannot Self-Correct Reasoning Yet Paper • 2310.01798 • Published Oct 3, 2023 • 30 • 2
Enable Language Models to Implicitly Learn Self-Improvement From Data Paper • 2310.00898 • Published Oct 2, 2023 • 21 • 1
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation Paper • 2309.15818 • Published Sep 27, 2023 • 18 • 4
VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning Paper • 2309.15091 • Published Sep 26, 2023 • 31 • 4
Exploring Large Language Models' Cognitive Moral Development through Defining Issues Test Paper • 2309.13356 • Published Sep 23, 2023 • 36 • 4
Small-scale proxies for large-scale Transformer training instabilities Paper • 2309.14322 • Published Sep 25, 2023 • 17 • 2
GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction Paper • 2305.18752 • Published May 30, 2023 • 2 • 1
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models Paper • 2309.11674 • Published Sep 20, 2023 • 29 • 2
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models Paper • 2309.12307 • Published Sep 21, 2023 • 82 • 8
DreamLLM: Synergistic Multimodal Comprehension and Creation Paper • 2309.11499 • Published Sep 20, 2023 • 57 • 5
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages Paper • 2309.09400 • Published Sep 17, 2023 • 77 • 4