Article StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation 10 days ago • 65
Arctic-embed Collection A collection of text embedding models optimized for retrieval accuracy and efficiency • 5 items • Updated 21 days ago • 10
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 21
Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives Paper • 2307.05473 • Published Jul 11, 2023 • 11
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs Paper • 2404.05719 • Published 30 days ago • 55
Prefix-Tuning: Optimizing Continuous Prompts for Generation Paper • 2101.00190 • Published Jan 1, 2021 • 3
ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline Paper • 2404.02893 • Published Apr 3 • 19
InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation Paper • 2309.06380 • Published Sep 12, 2023 • 31
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 85
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models Paper • 2403.13372 • Published Mar 20 • 50
Common Corpus Collection The largest public domain dataset for training LLMs. • 26 items • Updated Mar 20 • 99
Trending 3D and Depth Demos Collection One place to keep track of all 3D and Depth demos • 14 items • Updated 21 days ago • 16
On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models Paper • 2307.09793 • Published Jul 19, 2023 • 45
ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment Paper • 2403.05135 • Published Mar 8 • 39
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 563
OpenMath Collection A collection of models and datasets introduced in "OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset" • 15 items • Updated Feb 19 • 27
HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting Paper • 2402.06149 • Published Feb 9 • 15
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 61
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 443
Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions Paper • 2310.18780 • Published Oct 28, 2023 • 3
Mamba: Linear-Time Sequence Modeling with Selective State Spaces Paper • 2312.00752 • Published Dec 1, 2023 • 130
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models Paper • 2308.09729 • Published Aug 17, 2023 • 3
Assorted text-to-image diffusion models Collection This collection contains a list of my favorite text-to-image diffusion models. • 10 items • Updated 7 days ago • 5
Contra (Bottleneck T5) Collection Text autoencoders capable of embedding and generating text in a fixed-size latent space, useful for embeddings and latent space text editing. • 4 items • Updated Oct 3, 2023 • 27
Efficient Streaming Language Models with Attention Sinks Paper • 2309.17453 • Published Sep 29, 2023 • 13
SCREWS: A Modular Framework for Reasoning with Revisions Paper • 2309.13075 • Published Sep 20, 2023 • 15
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset Paper • 2309.11998 • Published Sep 21, 2023 • 22
FABRIC: Personalizing Diffusion Models with Iterative Feedback Paper • 2307.10159 • Published Jul 19, 2023 • 29
Kosmos-2: Grounding Multimodal Large Language Models to the World Paper • 2306.14824 • Published Jun 26, 2023 • 34
Semantic HELM: An Interpretable Memory for Reinforcement Learning Paper • 2306.09312 • Published Jun 15, 2023 • 2