kastan
First attempt at supervised finetuning using kastan/rlhf-qa-conditional-generation-v2
2818f0a