S1-M-7B-Beta
Homepage | Our Official Code Repo | S1-M Dataset (Beta)
S1-M-7B-Beta is used for developing the algorithm "Simple Test-time Scaling in Multimodal Reasoning". It was obtained by fine-tuning the base model Qwen/Qwen2-VL-7B-Instruct on data containing the thinking tags <think> and </think>. Through this fine-tuning the model acquired a "think first, then respond" paradigm, which allows for experiments on test-time scaling.
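As a rough illustration of the "think first, then respond" output, the sketch below separates the reasoning inside the thinking tags from the final answer. It assumes the generated text has the shape `<think>reasoning</think>answer`, which the model card does not spell out; the helper name `split_think_response` and the example string are hypothetical and are not part of the official code repo.

```python
import re
from typing import Tuple

# Assumption: the model emits its reasoning between <think> and </think>,
# followed by the final response. The exact output format is not documented here.
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_think_response(text: str) -> Tuple[str, str]:
    """Split generated text into (thinking, response).

    Hypothetical helper for illustration. If no complete <think>...</think>
    pair is found, the thinking part is empty and the whole text is returned
    as the response.
    """
    match = THINK_PATTERN.search(text)
    if match is None:
        return "", text.strip()
    thinking = match.group(1).strip()
    # Everything after the closing tag is treated as the response.
    response = text[match.end():].strip()
    return thinking, response


if __name__ == "__main__":
    # Made-up example output, not produced by the actual model.
    generated = (
        "<think>The image shows 3 apples and 2 pears, so 5 fruits.</think>"
        "There are 5 pieces of fruit."
    )
    thinking, response = split_think_response(generated)
    print("thinking:", thinking)
    print("response:", response)
```

Separating the two parts this way makes it possible to measure or control the length of the thinking segment independently of the answer, which is the kind of quantity a test-time scaling experiment would vary.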
Note: The current model is a development version, not the final official version.