OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 20
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 40
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks Paper • 2407.08454 • Published Jul 11, 2024
Unveiling and Harnessing Hidden Attention Sinks: Enhancing Large Language Models without Training through Attention Calibration Paper • 2406.15765 • Published Jun 22, 2024 • 1
MG-Verilog: Multi-grained Dataset Towards Enhanced LLM-assisted Verilog Generation Paper • 2407.01910 • Published Jul 2, 2024
EDGE-LLM: Enabling Efficient Large Language Model Adaptation on Edge Devices via Layerwise Unified Compression and Adaptive Layer Tuning and Voting Paper • 2406.15758 • Published Jun 22, 2024
Revealing Fine-Grained Values and Opinions in Large Language Models Paper • 2406.19238 • Published Jun 27, 2024 • 14
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone Paper • 2404.14219 • Published Apr 22, 2024 • 254
GPT4AIGChip: Towards Next-Generation AI Accelerator Design Automation via Large Language Models Paper • 2309.10730 • Published Sep 19, 2023 • 2
Hint-Aug: Drawing Hints from Foundation Vision Transformers Towards Boosted Few-Shot Parameter-Efficient Tuning Paper • 2304.12520 • Published Apr 25, 2023 • 1
Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning Paper • 2306.15686 • Published Jun 23, 2023 • 1
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_3 Text Generation • Updated May 13, 2024 • 24
GeorgiaTech/0.0005_zephyr_withdpo_5551_4iters_bs256_newtrl_iter_3 Text Generation • Updated May 12, 2024 • 23
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_2 Text Generation • Updated May 12, 2024 • 220
GeorgiaTech/0.0005_llama_nodpo_3iters_bs128_531lr_oldtrl_iter_1 Text Generation • Updated May 12, 2024 • 263