Nearest Neighbor Speculative Decoding for LLM Generation and Attribution Paper • 2405.19325 • Published 3 days ago • 10
Self-Exploring Language Models: Active Preference Elicitation for Online Alignment Paper • 2405.19332 • Published 3 days ago • 9
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF Paper • 2405.19320 • Published 3 days ago • 6
Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities Paper • 2405.18669 • Published 4 days ago • 9
Xwin-LM: Strong and Scalable Alignment Practice for LLMs Paper • 2405.20335 • Published 2 days ago • 10
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series Paper • 2405.19327 • Published 3 days ago • 34
Transformers Can Do Arithmetic with the Right Embeddings Paper • 2405.17399 • Published 5 days ago • 44
List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs Paper • 2404.16375 • Published Apr 25 • 16
Article Decoding GPT-4'o': In-Depth Exploration of Its Mechanisms and Creating Similar AI. By KingNish • 11 days ago • 23
Critique Models (CM) on the 🤗 Hub Collection This collection contains some Critique Models (CM) for LLM evaluation available in the HuggingFace Hub • 5 items • Updated 25 days ago • 3
Article PaliGemma – Google's Cutting-Edge Open Vision Language Model 19 days ago • 131
CommonCanvas Collection Collection of models trained on the CommonCatalogue datasets • 8 items • Updated 16 days ago • 6
MAmmoTH2 Collection Scaling up instruction data from the web to build better LLMs • 11 items • Updated 6 days ago • 6
Blackhole Collection A black hole with lots of high-quality, multilingual dialogue datasets in many fields, making it easier to train LLMs with SFT and DPO methods. • 32 items • Updated 8 days ago • 6
SpeechVerse: A Large-scale Generalizable Audio Language Model Paper • 2405.08295 • Published 19 days ago • 10
SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models Paper • 2405.08317 • Published 19 days ago • 8
What matters when building vision-language models? Paper • 2405.02246 • Published 29 days ago • 87
No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding Paper • 2405.08344 • Published 19 days ago • 10
Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning Paper • 2405.08054 • Published 19 days ago • 19
BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation Paper • 2405.09546 • Published 17 days ago • 9
Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization Paper • 2401.15914 • Published Jan 29 • 7
Chronos Models Collection Chronos: Pretrained (language) models for time series forecasting based on the T5 architecture. • 6 items • Updated Mar 18 • 25
3D creation workflow Collection Going from a text prompt to a nice 3D model • 3 items • Updated Feb 6 • 23
Stable Diffusion LoRAs Collection Awesome LoRAs found on the hub - using only π΅ • 7 items • Updated Feb 6 • 14
Article Train custom AI models with the trainer API and adapt them to 🤗 By not-lain • 7 days ago • 21
Transferable and Principled Efficiency for Open-Vocabulary Segmentation Paper • 2404.07448 • Published Apr 11 • 10
DOCCI: Descriptions of Connected and Contrasting Images Paper • 2404.19753 • Published Apr 30 • 9
Transcription Collection Transcribe interviews for free with Whisper in Spaces. • 5 items • Updated Apr 23 • 3
Article Preference Tuning LLMs with Direct Preference Optimization Methods Jan 18 • 20
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 29 items • Updated 2 days ago • 181
OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset Paper • 2402.10176 • Published Feb 15 • 33
Functional Benchmarks for Robust Evaluation of Reasoning Performance, and the Reasoning Gap Paper • 2402.19450 • Published Feb 29 • 3
You Only Cache Once: Decoder-Decoder Architectures for Language Models Paper • 2405.05254 • Published 24 days ago • 8
Gemma: Open Models Based on Gemini Research and Technology Paper • 2403.08295 • Published Mar 13 • 43
AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs Paper • 2404.16873 • Published Apr 21 • 26
Article Bringing the Artificial Analysis LLM Performance Leaderboard to Hugging Face 30 days ago • 13
Article Let's grow some Domain Specific Datasets together By burtenshaw • Apr 29 • 27
Article Building Cost-Efficient Enterprise RAG applications with Intel Gaudi 2 and Intel Xeon 24 days ago • 7
Aya Datasets Collection The Aya Collection is a massive multilingual collection for over 100 languages consisting of 513 million instances of prompts and completions. • 4 items • Updated 9 days ago • 9
C4AI Command R Collection C4AI Command-R is a research release of a 35 billion parameter highly performant generative model. Command-R is a large language model with open weigh • 3 items • Updated 9 days ago • 12
C4AI Command R Plus Collection C4AI Command R+ is an open weights research release of a 104 billion parameter model with highly advanced capabilities. • 3 items • Updated 9 days ago • 18
Article LLM Comparison/Test: Llama 3 Instruct 70B + 8B HF/GGUF/EXL2 (20 versions tested and compared!) By wolfram • Apr 24 • 48
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models Paper • 2405.01535 • Published about 1 month ago • 102
WildChat: 1M ChatGPT Interaction Logs in the Wild Paper • 2405.01470 • Published about 1 month ago • 53