ProPD: Dynamic Token Tree Pruning and Generation for LLM Parallel Decoding • arXiv:2402.13485 • Published Feb 21, 2024
HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference • arXiv:2504.05897 • Published Apr 2025
MPCache: MPC-Friendly KV Cache Eviction for Efficient Private Large Language Model Inference • arXiv:2501.06807 • Published Jan 12, 2025
AdapMoE: Adaptive Sensitivity-based Expert Gating and Management for Efficient MoE Inference • arXiv:2408.10284 • Published Aug 19, 2024
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention • arXiv:2211.13955 • Published Nov 25, 2022
Mixture-of-Supernets: Improving Weight-Sharing Supernet Training with Architecture-Routed Mixture-of-Experts • arXiv:2306.04845 • Published Jun 8, 2023
AlphaNet: Improved Training of Supernets with Alpha-Divergence • arXiv:2102.07954 • Published Feb 16, 2021
AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling • arXiv:2011.09011 • Published Nov 18, 2020
Co-Exploration of Neural Architectures and Heterogeneous ASIC Accelerator Designs Targeting Multiple Tasks • arXiv:2002.04116 • Published Feb 10, 2020