Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model • Paper • 2404.04167 • Published 27 days ago • 8 • 2
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation • Paper • 2404.04256 • Published 27 days ago • 5 • 1
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism • Paper • 2312.04916 • Published Dec 8, 2023 • 5 • 7
LLaMA Beyond English: An Empirical Study on Language Capability Transfer • Paper • 2401.01055 • Published Jan 2 • 49 • 3
Gemini: A Family of Highly Capable Multimodal Models • Paper • 2312.11805 • Published Dec 19, 2023 • 44 • 10
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise • Paper • 2312.12436 • Published Dec 19, 2023 • 12 • 3
LLM360: Towards Fully Transparent Open-Source LLMs • Paper • 2312.06550 • Published Dec 11, 2023 • 52 • 3
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence • Paper • 2312.02087 • Published Dec 4, 2023 • 19 • 5
QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models • Paper • 2310.16795 • Published Oct 25, 2023 • 26 • 2
BitNet: Scaling 1-bit Transformers for Large Language Models • Paper • 2310.11453 • Published Oct 17, 2023 • 93 • 12