Hymba: A Hybrid-head Architecture for Small Language Models. arXiv:2411.13676, published Nov 20, 2024.
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks. arXiv:2407.08454, published Jul 11, 2024.
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration. arXiv:2406.15765, published Jun 22, 2024.
MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation. arXiv:2407.01910, published Jul 2, 2024.
EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting. arXiv:2406.15758, published Jun 22, 2024.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization. arXiv:2406.05981, published Jun 10, 2024.
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models. arXiv:2309.10730, published Sep 19, 2023.
Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning. arXiv:2304.12520, published Apr 25, 2023.
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning. arXiv:2306.15686, published Jun 23, 2023.
MixRT: Mixed Neural Representations For Real-Time NeRF Rendering. arXiv:2312.11841, published Dec 19, 2023.
ShiftAddViT: Mixture of Multiplication Primitives Towards Efficient Vision Transformer. arXiv:2306.06446, published Jun 10, 2023.
NetDistiller: Empowering Tiny Deep Learning via In-Situ Distillation. arXiv:2310.19820, published Oct 24, 2023.