amirabdullah19852020 committed on
Commit
81b5c0c
1 Parent(s): 3292a22

Update README.md

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -10,6 +10,7 @@ tags:
 
 This is a [TRL language model](https://github.com/huggingface/trl) that has been fine-tuned with reinforcement learning to
 guide the model outputs according to a value, function, or human feedback. The model can be used for text generation.
+This was used as a test model in the reward interpretability study at https://arxiv.org/abs/2310.08164.
 
 ## Usage
 