Article DS-MoE: Making MoE Models More Efficient and Less Memory-Intensive By bpan • 30 days ago • 26
MiniCheck & LLM-AggreFact Collection MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents • 4 items • Updated 12 days ago • 3
Llama3-ChatQA-1.5 Collection Llama3-ChatQA-1.5 models excel at conversational question answering (QA) and retrieval-augmented generation (RAG). • 6 items • Updated 5 days ago • 28
ViTamin Family Collection Designing Scalable Vision Models in the Vision-language Era. The best performing model is 'jienengchen/ViTamin-XL-384px'. • 16 items • Updated 28 days ago • 6
OpenCLIP DataComp Collection OpenCLIP models trained on DataComp (https://huggingface.co/papers/2304.14108). • 6 items • Updated Oct 9, 2023 • 6
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 8 items • Updated 30 days ago • 23
Stable Code Collection Suite of developer assistant models • 5 items • Updated about 1 month ago • 33
TextSquare: Scaling up Text-Centric Visual Instruction Tuning Paper • 2404.12803 • Published 20 days ago • 27
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves Paper • 2311.04205 • Published Nov 7, 2023 • 5
LLaVA-LLaMA-3 Collection Reproduction of various LLaVA models based on LLaMA-3 backbone. • 3 items • Updated 15 days ago • 2
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated 6 days ago • 43
MGM-Data Collection Official data collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 2 items • Updated 18 days ago • 6
ShapeLLM Collection Model collections of ShapeLLM, Universal 3D Object Understanding for Embodied Interaction. • 7 items • Updated Mar 7 • 1
NExT: Teaching Large Language Models to Reason about Code Execution Paper • 2404.14662 • Published 16 days ago • 3
DreamLLM Collection [ICLR 2024 Spotlight] DreamLLM: Synergistic Multimodal Comprehension and Creation (https://arxiv.org/abs/2309.11499) • 6 items • Updated Mar 22 • 2
LLaVA++ (LLaMA-3 and Phi-3-Mini) Collection Extending Visual Capabilities of LLaVA with LLaMA-3 and Phi-3 • 11 items • Updated 9 days ago • 21
Meta Llama 3 Collection This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated 20 days ago • 487
Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing Paper • 2404.12253 • Published 20 days ago • 49
Instruct and Extract: Instruction Tuning for On-Demand Information Extraction Paper • 2310.16040 • Published Oct 24, 2023 • 1
CiteSum: Citation Text-guided Scientific Extreme Summarization and Domain Adaptation with Limited Supervision Paper • 2205.06207 • Published May 12, 2022 • 1
AIF Datasets (with distilabel) Collection Small to medium size datasets either: synthetically generated, labelled with AI Feedback (AIF), or both • 3 items • Updated 2 days ago • 1
DIBT-SPIN Collection Collection of models and datasets fine tuned on the Data Is Better Together using SPIN trainer. • 46 items • Updated Mar 22 • 1
VisCoT Collection Visual CoT: Unleashing Chain-of-Thought Reasoning in the Multi-Modal Language Model • 4 items • Updated Mar 23 • 1
GNER Collection We introduce GNER, a Generative Named Entity Recognition framework, which demonstrates enhanced zero-shot capabilities across unseen entity domains. • 7 items • Updated Feb 28 • 6
Zephyr ORPO Collection Models and datasets to align LLMs with Odds Ratio Preference Optimisation (ORPO). Recipes here: https://github.com/huggingface/alignment-handbook • 3 items • Updated 27 days ago • 13
Argilla Preference Formatting Collection ORPO in Axolotl requires this format. These are sets using it. Format should be like this: https://gist.github.com/xzuyn/765157fa27738a9888dcb4e0aa3f5 • 5 items • Updated Apr 7 • 1
Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation Paper • 2402.10210 • Published Feb 15 • 28
Mantis Collection Mantis model family optimized for multi-image reasoning with interleaved text/image format • 9 items • Updated 1 day ago • 3
The Big Benchmarks Collection Collection Gathering benchmark spaces on the hub (beyond the Open LLM Leaderboard) • 12 items • Updated Feb 7 • 74
SymbolicAI: A framework for logic-based approaches combining generative models and solvers Paper • 2402.00854 • Published Feb 1 • 18
MosaicBERT Collection A collection of BERT-based models of different sequence lengths trained on the C4 dataset. Details: https://mosaicbert.github.io/ • 5 items • Updated Dec 27, 2023 • 3
PDF Document / OCR Datasets Collection Document datasets with .pdf files that are usable with pixparse libraries and tools. • 2 items • Updated Mar 30 • 36
Text-to-Image Base Models Collection All text-to-image open source base models, with their respective license • 28 items • Updated Feb 15 • 17
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models Paper • 2403.19647 • Published Mar 28 • 3
Efficient Estimation of Word Representations in Vector Space Paper • 1301.3781 • Published Jan 16, 2013 • 6