Scaling Synthetic Data Creation with 1,000,000,000 Personas Paper • 2406.20094 • Published 11 days ago • 80
4M Models Collection Multimodal models from https://4m.epfl.ch/ • 14 items • Updated 24 days ago • 29
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published 19 days ago • 76
Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning Paper • 2406.06469 • Published 29 days ago • 22
Article GPU Poor Savior: Revolutionizing Low-Bit Open Source LLMs and Cost-Effective Edge Computing By NicoNico • May 25 • 9
Article ⚗️ 🧑🏼‍🌾 Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 27
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 22 items • Updated May 31 • 362
Article Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face May 3 • 13
Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! By lyogavin • Apr 21 • 40
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models Paper • 2404.18796 • Published Apr 29 • 67
Article Estimating Memory Consumption of LLMs for Inference and Fine-Tuning for Cohere Command-R+ By Andyrasika • Apr 26 • 7
How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study Paper • 2404.14047 • Published Apr 22 • 38
Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models Paper • 2404.04478 • Published Apr 6 • 11
CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues Paper • 2404.03820 • Published Apr 4 • 21
No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance Paper • 2404.04125 • Published Apr 4 • 27
MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens Paper • 2404.03413 • Published Apr 4 • 22
Zeroshot Classifiers Collection These are my current best zeroshot classifiers. Some of my older models are downloaded more often, but the models in this collection are newer/better. • 11 items • Updated Apr 3 • 85
DBRX Collection DBRX is a mixture-of-experts (MoE) large language model trained from scratch by Databricks. • 3 items • Updated Mar 27 • 90
Common Corpus Collection The largest public domain dataset for training LLMs. • 27 items • Updated 22 days ago • 107
occiglot-eu5-7b-v0.1 Collection First release of 7B LLMs models for the 5 biggest European languages. All models initialised from mistral-7b-v0.1. • 10 items • Updated Mar 7 • 18
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 184
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Jun 6 • 201
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text Paper • 2401.12070 • Published Jan 22 • 42
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation Paper • 2401.08417 • Published Jan 16 • 28
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2 • 61
Improving Text Embeddings with Large Language Models Paper • 2401.00368 • Published Dec 31, 2023 • 77
Supervised Knowledge Makes Large Language Models Better In-context Learners Paper • 2312.15918 • Published Dec 26, 2023 • 8
WaveCoder: Widespread And Versatile Enhanced Instruction Tuning with Refined Data Generation Paper • 2312.14187 • Published Dec 20, 2023 • 49
MindLLM: Pre-training Lightweight Large Language Model from Scratch, Evaluations and Domain Applications Paper • 2310.15777 • Published Oct 24, 2023 • 2
Beyond Surface: Probing LLaMA Across Scales and Layers Paper • 2312.04333 • Published Dec 7, 2023 • 18
Teaching Language Models to Self-Improve through Interactive Demonstrations Paper • 2310.13522 • Published Oct 20, 2023 • 10
Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models Paper • 2310.13127 • Published Oct 19, 2023 • 10
LLM Leaderboard best models ❤️‍🔥 Collection A daily uploaded list of models with best evaluations on the LLM leaderboard: • 264 items • Updated 16 days ago • 347
smol models Collection Models where the size of the model file (model.safetensors or pytorch_model.bin) < 50mb • 58 items • Updated 6 days ago • 6
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 464
Learning Interpretable Style Embeddings via Prompting LLMs Paper • 2305.12696 • Published May 22, 2023 • 3
TPTU: Task Planning and Tool Usage of Large Language Model-based AI Agents Paper • 2308.03427 • Published Aug 7, 2023 • 13
Skills-in-Context Prompting: Unlocking Compositionality in Large Language Models Paper • 2308.00304 • Published Aug 1, 2023 • 23
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 237
WizardCoder: Empowering Code Large Language Models with Evol-Instruct Paper • 2306.08568 • Published Jun 14, 2023 • 28
Instruction Mining: High-Quality Instruction Data Selection for Large Language Models Paper • 2307.06290 • Published Jul 12, 2023 • 9
Extending Context Window of Large Language Models via Positional Interpolation Paper • 2306.15595 • Published Jun 27, 2023 • 53