valerielucro: second iteration of QLoRA with DPO on the full gsm8k preference dataset version 2.1, 1 epoch, rank 64, beta 0.6
{
"bos_token": {
"content": "<s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"eos_token": {
"content": "</s>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
},
"pad_token": "</s>",
"unk_token": {
"content": "<unk>",
"lstrip": false,
"normalized": false,
"rstrip": false,
"single_word": false
}
}
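
The map above mixes two entry forms: `bos_token`, `eos_token`, and `unk_token` are objects with a `content` field, while `pad_token` is a plain string. The following is a minimal sketch, using only the Python standard library, of reading both forms and confirming that the pad token reuses the end-of-sequence token `</s>`. The helper names `token_text` and `load_special_tokens` are hypothetical and not part of any library, and the JSON is embedded as a string only to keep the example self-contained.

```python
import json

# Copy of the special-tokens map shown above, embedded so the sketch runs on its own.
SPECIAL_TOKENS_JSON = """
{
  "bos_token": {"content": "<s>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "eos_token": {"content": "</s>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false},
  "pad_token": "</s>",
  "unk_token": {"content": "<unk>", "lstrip": false, "normalized": false, "rstrip": false, "single_word": false}
}
"""


def token_text(entry):
    """Return the literal token string for either entry form (plain string or object with "content")."""
    return entry if isinstance(entry, str) else entry["content"]


def load_special_tokens(raw):
    """Hypothetical helper: map each token role to its literal string."""
    return {role: token_text(entry) for role, entry in json.loads(raw).items()}


tokens = load_special_tokens(SPECIAL_TOKENS_JSON)

# True when padding and end-of-sequence share the same token string.
pad_is_eos = tokens["pad_token"] == tokens["eos_token"]

print(tokens)
print("pad token reuses eos token:", pad_is_eos)
```

Because padding and end-of-sequence share one token here, code that masks labels or attention by comparing against the pad token would likely also mask genuine `</s>` positions, so it may be worth checking how the training and inference code handles that case.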