EN-T: Optimizing Tensor Computing Engines Performance via Encoder-Based Methodology Paper • 2404.11887 • Published Apr 18
LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration Paper • 2408.06003 • Published Aug 12
SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs Paper • 2410.13276 • Published Oct 17 • 25
T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge Paper • 2407.00088 • Published Jun 25 • 10