Collections
Collections including paper arxiv:2402.08609

- BlackMamba: Mixture of Experts for State-Space Models
  Paper • 2402.01771 • Published • 22
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 26
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
  Paper • 2401.15947 • Published • 46
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 35

- Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning
  Paper • 2402.06102 • Published • 4
- Mixtures of Experts Unlock Parameter Scaling for Deep RL
  Paper • 2402.08609 • Published • 34
- In deep reinforcement learning, a pruned network is a good network
  Paper • 2402.12479 • Published • 16
- Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping
  Paper • 2402.14083 • Published • 43

- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 117
- stabilityai/stable-video-diffusion-img2vid-xt
  Image-to-Video • Updated • 124k • 2.28k
- LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes
  Paper • 2311.13384 • Published • 48
- HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis
  Paper • 2311.12454 • Published • 27

- UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
  Paper • 2311.09257 • Published • 43
- Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
  Paper • 2310.04378 • Published • 19
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 43
- Exponentially Faster Language Modelling
  Paper • 2311.10770 • Published • 117

- QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models
  Paper • 2310.16795 • Published • 26
- Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
  Paper • 2308.12066 • Published • 4
- Towards MoE Deployment: Mitigating Inefficiencies in Mixture-of-Expert (MoE) Inference
  Paper • 2303.06182 • Published • 1
- EvoMoE: An Evolutional Mixture-of-Experts Training Framework via Dense-To-Sparse Gate
  Paper • 2112.14397 • Published • 1

- Moral Foundations of Large Language Models
  Paper • 2310.15337 • Published • 1
- Specific versus General Principles for Constitutional AI
  Paper • 2310.13798 • Published • 2
- Contrastive Preference Learning: Learning from Human Feedback without RL
  Paper • 2310.13639 • Published • 21
- RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
  Paper • 2309.00267 • Published • 45