adapter trained with DPO on the gsm8k preference dataset with cot and 1 epoch 6b24dac verified valerielucro committed on Jun 26
adapter trained with DPO on the gsm8k preference dataset with cot and 1 epoch 487d117 verified valerielucro committed on Jun 26