some1nostr
committed on
Update README.md
README.md
CHANGED
@@ -7,12 +7,13 @@ license: apache-2.0
 # Model Card for Ostrich


-Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
-Trained a bit about nostr
-Trained a bit about bitcoin
-Trained a bit in the health domain
+- Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
+- Trained a bit about nostr
+- Trained a bit about bitcoin
+- Trained a bit in the health domain
+
+I am having success with chat template: \<s\> \[INST\] ... \<\/s\>

-I am having success with chat template: <s> [INST] ... </s>
 It may also work with ChatML format, though I see more repetitions when I use that.


@@ -42,10 +43,10 @@ The trainer, developer or uploader of this model does not assume any liability.

 ### Training Data

-Nostr
-Number of notes: 300k
+Nostr related info from web and nostr itself, bitcoin related info.

 ### Training Procedure

 LLaMa-Factory is used to train on 2x3090! fsdp_qlora is the technique.
-
+
+It took ~185 hours for a dataset of 122MB.
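The README only gives the chat template as `<s> [INST] ... </s>` and does not spell out what goes in the `...`. As a rough sketch of how a single turn might be assembled, the helper below assumes a Llama-2/Mistral-style layout in which `[/INST]` closes the user message and the model's reply follows it. The `[/INST]` marker, the function name `format_turn`, and its parameters are assumptions for illustration, not something the model card confirms.

```python
from typing import Optional


def format_turn(user_message: str,
                assistant_reply: Optional[str] = None,
                bos: str = "<s>",
                eos: str = "</s>") -> str:
    """Build one chat turn in the '<s> [INST] ... </s>' shape from the README.

    ASSUMPTION: '[/INST]' ends the user message (Llama-2/Mistral style);
    the README itself only shows '<s> [INST] ... </s>'.
    """
    prompt = f"{bos} [INST] {user_message.strip()} [/INST]"
    if assistant_reply is None:
        # Open prompt: the model is expected to generate the reply and the EOS.
        return prompt
    # Completed turn, e.g. for building multi-turn history.
    return f"{prompt} {assistant_reply.strip()} {eos}"


if __name__ == "__main__":
    print(format_turn("What is nostr?"))
```

Many tokenizers prepend the beginning-of-sequence token on their own; in that case passing `bos=""` (and stripping the leading space) avoids a doubled `<s>`.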