- Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
  Paper • 2401.00448 • Published • 28
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
  Paper • 2401.06951 • Published • 25
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 78
yizhou shan
lastweek
AI & ML interests
None yet
Recent Activity
upvoted a paper 13 days ago
Byte Latent Transformer: Patches Scale Better Than Tokens
Organizations
None yet
Collections
3
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 145
- Advances in 3D Generation: A Survey
  Paper • 2401.17807 • Published • 17
- Efficient Tool Use with Chain-of-Abstraction Reasoning
  Paper • 2401.17464 • Published • 16
- MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training
  Paper • 2403.09611 • Published • 125
Papers
2
Models
None public yet
Datasets
None public yet