Nemotron 4 340B Collection Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models. • 4 items • Updated 9 days ago • 152
Granite Code Models Collection A series of code models trained by IBM licensed under Apache 2.0 license. We release both the base pretrained and instruct models. • 25 items • Updated 2 days ago • 147
view article Article Run the strongest open-source LLM model: Llama3 70B with just a single 4GB GPU! By lyogavin • Apr 21 • 40
Phi-3 Collection Phi-3 family of small language and multi-modal models. Language models are available in short- and long-context lengths. • 23 items • Updated 15 days ago • 377
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22 • 243
view article Article Introducing the LiveCodeBench Leaderboard - Holistic and Contamination-Free Evaluation of Code LLMs Apr 16 • 11
Awesome Document AI Collection A collection of open-source document AI 📄 📝 📈 • 27 items • Updated Mar 11 • 43
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27 • 88
EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions Paper • 2402.17485 • Published Feb 27 • 184
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27 • 581
Sora参考论文 Collection OpenAI "Video generation models as world simulators"技术报告后面的参考论文,总共32篇。OpenAI的ImageGPT和Dalle3这两篇缺失,链接已补充到note中。 • 32 items • Updated Feb 18 • 53
OS-Copilot: Towards Generalist Computer Agents with Self-Improvement Paper • 2402.07456 • Published Feb 12 • 40
Text-to-Image Base Models Collection All text-to-image open source base models, with their respective license • 28 items • Updated May 10 • 19
Self-Discover: Large Language Models Self-Compose Reasoning Structures Paper • 2402.03620 • Published Feb 6 • 107
Qwen1.5 GGUF Collection GGUF quants for the new Qwen1.5 model (https://qwenlm.github.io/blog/qwen1.5/) • 5 items • Updated Feb 5 • 10
Qwen1.5 Collection Qwen1.5 is the improved version of Qwen, the large language model series developed by Alibaba Cloud. • 55 items • Updated Jun 6 • 202
Gemini: A Family of Highly Capable Multimodal Models Paper • 2312.11805 • Published Dec 19, 2023 • 45
LLM in a flash: Efficient Large Language Model Inference with Limited Memory Paper • 2312.11514 • Published Dec 12, 2023 • 255
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator Paper • 2312.04474 • Published Dec 7, 2023 • 29
Alpha-CLIP: A CLIP Model Focusing on Wherever You Want Paper • 2312.03818 • Published Dec 6, 2023 • 31
Recent models: last 100 repos, sorted by creation date Collection The last 100 repos I have created. Sorted by creation date descending, so the most recently created repos appear at the top. • 121 items • Updated Jan 31 • 470
Seamless Communication Collection A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16 • 135
LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes Paper • 2311.13384 • Published Nov 22, 2023 • 48
FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline Paper • 2311.13073 • Published Nov 22, 2023 • 53
DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models Paper • 2307.02421 • Published Jul 5, 2023 • 34