---
datasets:
- samhog/psychology-10k
---

# Psychology Alpaca 🍩
This is a LLaMA-7B language model fine-tuned on 10,000 psychology-related prompts and answers generated by ChatGPT. It was trained on a single A100 GPU on Google Colab. The model shows some knowledge of psychology and generally performs better than its base model.
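Alpaca-style models are usually queried by wrapping the user's question in an instruction template before generation. A minimal sketch, assuming the standard Stanford Alpaca template (an assumption; the exact template used during training is not documented here):

```python
def format_prompt(instruction: str) -> str:
    """Wrap a user instruction in an Alpaca-style prompt template.

    The model's completion is expected after the "### Response:" marker.
    """
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n"
        "### Response:\n"
    )

# Example: build a prompt for a psychology-related question.
prompt = format_prompt("Explain the concept of cognitive dissonance.")
print(prompt)
```

The formatted string would then be tokenized and passed to the model's generate call; everything after `### Response:` is the model's answer.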

### Background

This model was developed as part of a thesis project in the field of machine learning and psychology. It was used as the base model for further fine-tuning via reinforcement learning. The goal of the thesis was to compare reinforcement learning from *human feedback* and *AI feedback*. When the paper is available, it will be linked here!

**Authors:**
Samuel Höglund, samhog@kth.se;
Josef Khedri, jkhedri@kth.se