Article Expanding Model Context and Creating Chat Models with a Single Click By maywell • about 1 month ago • 33
Article Introducing Idefics2: A Powerful 8B Vision-Language Model for the community Apr 15 • 133
PoSE: Efficient Context Window Extension of LLMs via Positional Skip-wise Training Paper • 2309.10400 • Published Sep 19, 2023 • 22
ablation-models Collection 1.8B models trained on 350BT to compare different pretraining datasets • 7 items • Updated 24 days ago • 20
ORPO: Monolithic Preference Optimization without Reference Model Paper • 2403.07691 • Published Mar 12 • 58
AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation Paper • 2404.12753 • Published Apr 19 • 38
SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis Paper • 2307.01952 • Published Jul 4, 2023 • 74
MGM Collection Official model collection for the paper "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models" • 13 items • Updated 26 days ago • 43
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 567
My favorite models (benchs+usage/feel+innovation) Collection How I select models? First, I bench them to trim the overfit & dumbified models. Then, I test them. The smartest to my taste end up here. • 10 items • Updated 18 days ago • 7
Nemotron 3 8B Collection The Nemotron 3 8B Family of models is optimized for building production-ready generative AI applications for the enterprise. • 5 items • Updated Feb 19 • 37
Textbooks Are All You Need II: phi-1.5 technical report Paper • 2309.05463 • Published Sep 11, 2023 • 84
Zephyr 7B Collection Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 9 items • Updated Apr 12 • 138