samhog/psychology-llama-rlaif

samhog committed on Jun 19, 2023

Commit 7e1a0e3 (1 parent: 58d577c)

Update README.md

Files changed (1)

README.md +1 -1

@@ -1,5 +1,5 @@
 # Psychology LLaMA RLAIF 🦙🙋‍♂🤖
-This is a LLaMA-7B-based language model trained in the field of psychology using Reinforcement Learning from AI. To learn more about RLAIF, I recommend [this](https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback) great, revolutionizing 2022 paper by Anthropic. For some insights in the process of fine-tuning using RLHF, which is a very similar process, there is a great blogpost on Hugging Face found [here!](https://huggingface.co/blog/stackllama)
+This is a LLaMA-7B-based language model trained in the field of psychology using Reinforcement Learning from AI Feedback. To learn more about RLAIF, I recommend [this](https://www.anthropic.com/index/constitutional-ai-harmlessness-from-ai-feedback) great, revolutionizing 2022 paper by Anthropic. For some insights in the process of fine-tuning using RLHF, which is a very similar process, there is a great blogpost on Hugging Face found [here!](https://huggingface.co/blog/stackllama)
 **Links**: [Reward model](https://huggingface.co/samhog/RLAIF-psychology-alpaca-rm); [Base model](https://huggingface.co/samhog/psychology-llama-merged)