SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound Paper • 2405.00233 • Published 18 days ago • 12
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published 17 days ago • 18
LLM-AD: Large Language Model based Audio Description System Paper • 2405.00983 • Published 17 days ago • 13
FLAME: Factuality-Aware Alignment for Large Language Models Paper • 2405.01525 • Published 16 days ago • 21
Granite Code Models Collection A series of code models trained by IBM and released under the Apache 2.0 license. We release both the base pretrained and instruct models. • 10 items • Updated 6 days ago • 117
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions Paper • 2404.13208 • Published 29 days ago • 37
StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation Paper • 2405.01434 • Published 16 days ago • 44
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published 16 days ago • 92
GreenBitAI MLX LLM Collection GreenBitAI's Low-bit LLMs in MLX format • 69 items • Updated 12 days ago • 4
BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation Paper • 2402.03216 • Published Feb 5 • 2
Flexibly Scaling Large Language Models Contexts Through Extensible Tokenization Paper • 2401.07793 • Published Jan 15 • 3
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning Paper • 2401.06532 • Published Jan 12 • 10
Better & Faster Large Language Models via Multi-token Prediction Paper • 2404.19737 • Published 18 days ago • 61
Nomic Embed: Training a Reproducible Long Context Text Embedder Paper • 2402.01613 • Published Feb 2 • 13
🦢SWIM-IR Dataset Collection 29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated 20 days ago • 6
Article 🦙⚗️ Using Llama3 and distilabel to build fine-tuning datasets By dvilasuero • 22 days ago • 54
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published 26 days ago • 230
Eurus Collection Advancing LLM Reasoning Generalists with Preference Trees • 11 items • Updated Apr 15 • 22
Awesome SFT datasets Collection A curated list of interesting datasets to fine-tune language models with. • 43 items • Updated Apr 12 • 89
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated about 1 month ago • 522
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues Paper • 2404.03820 • Published Apr 4 • 20
Audio Dialogues: Dialogues dataset for audio and music understanding Paper • 2404.07616 • Published Apr 11 • 14
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12 • 61
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 61 items • Updated 4 days ago • 59
LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders Paper • 2404.05961 • Published Apr 9 • 62
Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences Paper • 2404.03715 • Published Apr 4 • 57
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance Paper • 2404.04125 • Published Apr 4 • 26
Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention Paper • 2404.07143 • Published Apr 10 • 92
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs Paper • 2310.01801 • Published Oct 3, 2023 • 3
Teaching Large Language Models to Reason with Reinforcement Learning Paper • 2403.04642 • Published Mar 7 • 43
LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement Paper • 2403.15042 • Published Mar 22 • 24
Larimar: Large Language Models with Episodic Memory Control Paper • 2403.11901 • Published Mar 18 • 30
Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models Paper • 2404.02575 • Published Apr 3 • 46
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning Paper • 2312.15685 • Published Dec 25, 2023 • 16
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 76