nabeelshan
/

rlhf-gpt2-pipeline

Text Generation

reinforcement-learning

instruction-tuning

Model card Files Files and versions