3PO Models
3PO family methods trained on DapoMath-17k using Olmo3-IVON-SFT-7B and Qwen2.5Math-IVON-SFT-7B
BayesRL/Olmo3-B3PO-7B • Text Generation • 7B • Updated about 3 hours ago
BayesRL/Olmo3-M3PO-7B • Text Generation • 7B • Updated about 3 hours ago • 3
BayesRL/Olmo3-C3PO-7B • Text Generation • 7B • Updated about 3 hours ago • 1
BayesRL/Olmo3-M3POPlus-7B • Text Generation • 7B • Updated about 3 hours ago • 1
Warm-started Checkpoints
A collection of three models trained on the Nemotron Post Training Dataset for reasoning tasks with IVON
BayesRL/Llama3.1-IVON-SFT-8B • Text Generation • 8B • Updated about 3 hours ago • 64
BayesRL/Qwen2.5Math-IVON-SFT-7B • 8B • Updated Apr 7 • 2.95k
BayesRL/Olmo3-IVON-SFT-7B • Text Generation • 7B • Updated about 3 hours ago • 1.77k