pt-sk/GPT2_NonToxic

Text Generation

text-generation-inference

Inference Endpoints

pt-sk committed on Jul 15

Commit 9e74c5b • 1 Parent(s): 3c33df0

Update README.md

Files changed (1)

README.md +4 -1

@@ -1,7 +1,10 @@
 ---
 license: mit
 datasets: pt-sk/toxic_classification
-tags: ["PPO", "RLHF"]
+tags:
+- PPO
+- RLHF
+pipeline_tag: text-generation
 ---
 Aligning the model using Proximal Policy Optimization (PPO). The goal is to train the model to generate non-toxic reviews. The training process utilizes the `trl` library for reinforcement learning, the `transformers` library for model handling, and `datasets` for dataset management.
 Implementation code is available here: [GitHub](https://github.com/sathishkumar67/GPT-2-Non-Toxic-RLHF)
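
The README names PPO but does not spell out the objective. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate objective in plain Python. It is not the author's code and not `trl`'s internals. The function name `ppo_clipped_objective`, its parameters, and the default `clip_eps=0.2` are hypothetical choices made for this example. In a non-toxicity setup the advantages would presumably derive from a reward such as a toxicity classifier score, which is an assumption here.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: per-sample log-probabilities of the taken actions
    under the current and the old (sampling) policy.
    advantages: per-sample advantage estimates (assumed given).
    clip_eps: clipping range; 0.2 is an illustrative default.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Ratio of about 2 with a positive advantage: the gain is capped by clipping.
    print(ppo_clipped_objective([math.log(2.0)], [0.0], [1.0]))
```

Taking the minimum of the clipped and unclipped terms means a sample stops contributing extra gain once the policy has moved more than `clip_eps` in the favorable direction, while unfavorable moves are still penalized in full.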