PeijieDong

pprp

https://pprp.github.io

AI & ML interests

Model Compression; Large Language Models

Organizations

None yet

pprp's activity

liked a model 7 days ago

nvidia/Llama-3_1-Nemotron-Ultra-253B-v1

Text Generation • Updated 3 days ago • 16.6k • 262

upvoted a paper 7 days ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published 7 days ago • 232

liked a model 8 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 26 days ago • 1.71M • 12k

liked a model 13 days ago

AlphaGaO/DeepSeek-V3-0324-Fused-8E-39B-Unhealed-Preview

Text Generation • Updated 13 days ago • 18 • 1

liked a model 21 days ago

deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

Text Generation • Updated Feb 24 • 1.69M • 1.17k

liked 2 models 27 days ago

internlm/internlm3-8b-instruct

Text Generation • Updated Feb 11 • 53k • 217

internlm/internlm2_5-20b-chat

Text Generation • Updated Mar 13 • 1.68k • 93

liked a model 28 days ago

deepseek-ai/DeepSeek-V3-0324

Text Generation • Updated 26 days ago • 245k • 2.7k

liked 2 models about 1 month ago

deepseek-ai/deepseek-moe-16b-base

Text Generation • Updated Jan 12, 2024 • 12.5k • 115

google/gemma-3-27b-it

Image-Text-to-Text • Updated Mar 21 • 656k • 1.23k

liked 3 models about 2 months ago

liked a Space about 2 months ago

RWKV-Gradio-2 🚀

Space • Generate text based on user prompts and settings • 626

upvoted a paper about 2 months ago

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Paper • 2502.06781 • Published Feb 10 • 61

liked a Space about 2 months ago

LLM训练终极指南 (The Ultimate Guide to LLM Training) | The Ultra-Scale Playbook 🔥

Space • Learn every aspect of LLM training • 195

liked a Space 2 months ago

The Ultra-Scale Playbook 🌌

Space • The ultimate guide to training LLM on large GPU Clusters • 2.49k

liked a model 2 months ago

cognitivecomputations/DeepSeek-R1-AWQ

Text Generation • Updated 24 days ago • 9.99k • 77

upvoted an article 2 months ago

Accelerate Large Model Training using PyTorch Fully Sharded Data Parallel

Article • May 2, 2022 • 4

upvoted a paper 2 months ago

Mediator: Memory-efficient LLM Merging with Less Parameter Conflicts and Uncertainty Based Routing

Paper • 2502.04411 • Published Feb 6 • 4