nbeerbower
commited on
Commit
•
f5d5149
1
Parent(s):
e4dd938
Update README.md
Browse files
README.md
CHANGED
@@ -11,4 +11,20 @@ base_model:
|
|
11 |
- mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated
|
12 |
---
|
13 |
|
14 |
-
# Llama3.1-Allades-8B
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
11 |
- mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated
|
12 |
---
|
13 |
|
14 |
+
# Llama3.1-Allades-8B
|
15 |
+
|
16 |
+
Allades finetunes abliterated Llama 3.1 with 5 datasets to improve creative writing, reasoning, and roleplay.
|
17 |
+
|
18 |
+
## Datasets
|
19 |
+
|
20 |
+
- [jondurbin/gutenberg-dpo-v0.1](https://huggingface.co/datasets/jondurbin/gutenberg-dpo-v0.1)
|
21 |
+
- [nbeerbower/gutenberg2-dpo](https://huggingface.co/datasets/nbeerbower/gutenberg2-dpo)
|
22 |
+
- [jondurbin/truthy-dpo-v0.1](https://huggingface.co/datasets/jondurbin/truthy-dpo-v0.1)
|
23 |
+
- [kyujinpy/orca_math_dpo](https://huggingface.co/datasets/kyujinpy/orca_math_dpo)
|
24 |
+
- [antiven0m/physical-reasoning-dpo](https://huggingface.co/datasets/antiven0m/physical-reasoning-dpo)
|
25 |
+
|
26 |
+
## Training
|
27 |
+
|
28 |
+
ORPO tuned for 1 epoch with 2x RTX 3090 (sponsored by [Schneewolf Labs](https://schneewolflabs.com)).
|
29 |
+
|
30 |
+
Data was prepared with [Llama 3.1 Instruct](https://www.llama.com/docs/model-cards-and-prompt-formats/llama3_1/).
|