UniTabE: A Universal Pretraining Protocol for Tabular Foundation Model in Data Science Paper • 2307.09249 • Published Jul 18, 2023
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation Paper • 2401.13560 • Published Jan 24, 2024
CMMU: A Benchmark for Chinese Multi-modal Multi-type Question Understanding and Reasoning Paper • 2401.14011 • Published Jan 25, 2024
AltDiffusion: A Multilingual Text-to-Image Diffusion Model Paper • 2308.09991 • Published Aug 19, 2023
AltCLIP: Altering the Language Encoder in CLIP for Extended Language Capabilities Paper • 2211.06679 • Published Nov 12, 2022
InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning Paper • 2408.07089 • Published Aug 9, 2024
AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies Paper • 2408.06567 • Published Aug 13, 2024
CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models Paper • 2410.18505 • Published Oct 24, 2024
Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data Paper • 2410.18558 • Published Oct 24, 2024