some1nostr committed · Commit 42e3ca2 · verified · 1 Parent(s): 73f3a59

Update README.md

Files changed (1)
  1. README.md +9 -8
README.md CHANGED
@@ -7,12 +7,13 @@ license: apache-2.0
  # Model Card for Ostrich


- Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
- Trained a bit about nostr
- Trained a bit about bitcoin
- Trained a bit in the health domain
+ - Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
+ - Trained a bit about nostr
+ - Trained a bit about bitcoin
+ - Trained a bit in the health domain
+
+ I am having success with chat template: \<s\> \[INST\] ... \<\/s\>

- I am having success with chat template: <s> [INST] ... </s>
  It may also work with ChatML format, though I see more repetitions when I use that.


@@ -42,10 +43,10 @@ The trainer, developer or uploader of this model does not assume any liability.

  ### Training Data

- Nostr notes, kind=1, longer notes are taken from reputable accounts.
- Number of notes: 300k
+ Nostr related info from web and nostr itself, bitcoin related info.

  ### Training Procedure

  LLaMa-Factory is used to train on 2x3090! fsdp_qlora is the technique.
- It took ~170 hours for a dataset of 120MB.
+
+ It took ~185 hours for a dataset of 122MB.
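
For a rough sense of what the updated training figures mean, the two versions of the README line can be turned into an approximate throughput. This is only a back-of-the-envelope sketch: the hours and sizes are the approximate ("~") values quoted in the diff, and the helper name `throughput_mb_per_hour` is hypothetical, not part of LLaMa-Factory or the model card.

```python
def throughput_mb_per_hour(dataset_mb: float, hours: float) -> float:
    """Average amount of dataset processed per hour of training, in MB."""
    return dataset_mb / hours


# Figures as quoted in the README before and after this commit (both approximate).
old_rate = throughput_mb_per_hour(dataset_mb=120, hours=170)
new_rate = throughput_mb_per_hour(dataset_mb=122, hours=185)

print(f"before: {old_rate:.3f} MB/hour")  # before: 0.706 MB/hour
print(f"after:  {new_rate:.3f} MB/hour")  # after:  0.659 MB/hour
```

Both figures come out to well under 1 MB of training text per hour on the 2x3090 setup, which is all the quoted numbers can support; they say nothing about epochs, sequence length, or batch size.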