---
datasets:
- samhog/psychology-10k
---
# Psychology Alpaca 🍩
This is a LLaMA-7B language model fine-tuned on 10,000 psychology-related prompts and answers generated by ChatGPT. The model was trained on a single A100 GPU in Google Colab. It shows some knowledge of psychology and generally performs better than its LLaMA-7B base model.
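A minimal usage sketch, assuming the weights load through the standard `transformers` causal-LM API; the repo ID below is inferred from this card's title and may differ:

```python
# Sketch only: "samhog/psychology-alpaca" is an assumed repo ID,
# not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "samhog/psychology-alpaca"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "What is cognitive behavioral therapy?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```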
### Background
This model was developed as part of a thesis project in the field of machine learning and psychology. It was used as a base model for further fine-tuning using reinforcement learning. The goal of the thesis was to compare reinforcement learning from *human feedback* and *AI feedback*. When the paper is available, it will be linked here!
**Links**: [RLHF model](https://huggingface.co/samhog/psychology-llama-rlhf); [RLAIF model](https://huggingface.co/samhog/psychology-llama-rlaif)
**Authors:**
Samuel Höglund, samhog@kth.se;
Josef Khedri, jkhedri@kth.se