view article Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 9 days ago • 23
view article Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 15 days ago • 59
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models Paper • 2501.09686 • Published 22 days ago • 36
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 22 days ago • 67
Tools 4 Agents Collection This is a collection of spaces on the hub that are useful for building agents. https://huggingface.co/docs/smolagents/en/tutorials/tools • 5 items • Updated 24 days ago • 4
mechanistic interpretability with sparse autoencoders Collection A collection of papers that I found useful for learning about using Sparse Autoencoders for finding interpretable features in language models • 9 items • Updated Sep 3, 2024 • 1
view article Article FineWeb2-C: Help Build Better Language Models in Your Language By davanstrien and 5 others • Dec 23, 2024 • 18
Reflections from the 2024 Large Language Model (LLM) Hackathon for Applications in Materials Science and Chemistry Paper • 2411.15221 • Published Nov 20, 2024 • 28
view article Article Efficient Deep Learning: A Comprehensive Overview of Optimization Techniques 👐 📚 By Isayoften • Aug 26, 2024 • 49
timm tiny test models Collection A collection of very small (~300-500k parameter) models at 160x160 resolution, for testing purposes. Trained on ImageNet-1k. • 13 items • Updated Oct 2, 2024 • 5
Llama 3.3 (All Versions) Collection Meta's new Llama 3.3 (70B) model in all formats. Includes GGUF, 4-bit bnb and original versions. • 3 items • Updated 3 days ago • 35
Unsloth 4-bit Dynamic Quants Collection Unsloths Dynamic 4bit Quants selectively skips quantizing certain parameters; greatly improving accuracy while only using <10% more VRAM than BnB 4bit • 20 items • Updated 1 day ago • 39
view article Article Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA May 24, 2023 • 112