Article 4 Introducing ConTextual: How well can your Multimodal model jointly reason over text and image in text-rich scenes?
hbXNov/LLaDA-8B-Instruct-mlp2x_gelu-pretrain_blip558_v4-cont_200k_openllavanext_allava_gpt4omini Updated 3 days ago • 1
hbXNov/llama_8b_instruct_distill_r1_q1p5b_balanced_train_e6_lr5e-7_balanced_ckpt-4386 Updated Mar 2 • 2
hbXNov/distill_r1_qwen_1p5B_gpt_4o_verify_remove_think_processed Updated Feb 27 • 8.02k • 21