OSim-4B (post-trained, text)

Post-trained (text) 4B checkpoint of OSim (OdysSim) for human-behavior / user simulation. Mirror of sunweiwei/OSim-4B. 4B sibling of cmu-lti/osim-8b-post.

Training

  • Base: Qwen3-4B -> OdysSim midtraining -> task-specific RL + expert consolidation.

Citation

Cite the OdysSim paper (Building Foundation Models for Human Behavior Simulation). Code: https://github.com/sunnweiwei/OdysSim

Downloads last month
15
Safetensors
Model size
4B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for cmu-lti/osim-4b

Finetuned
Qwen/Qwen3-4B
Finetuned
(708)
this model

Collection including cmu-lti/osim-4b