Towards Best Practices for Open Datasets for LLM Training • Paper • 2501.08365 • Published 28 days ago • 54
bluepen5805/DeepSeek-R1-Distill-Qwen-14B-Japanese-gguf • Text Generation • Updated 15 days ago • 18.1k • 31
TinySwallow • Collection • Compact Japanese models trained with "TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models" • 5 items • Updated 13 days ago • 13
EmbodiedEval: Evaluate Multimodal LLMs as Embodied Agents • Paper • 2501.11858 • Published 22 days ago • 5