some1nostr/Ostrich-70B

Text Generation

Inference Endpoints


some1nostr committed on Apr 18, 2024

Commit 42e3ca2 (verified) · 1 Parent(s): 73f3a59

Update README.md

Files changed (1)

README.md +9 -8

README.md CHANGED

@@ -7,12 +7,13 @@ license: apache-2.0
 # Model Card for Ostrich
-Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
-Trained a bit about nostr
-Trained a bit about bitcoin
-Trained a bit in the health domain
-I am having success with chat template: <s> [INST] ... </s>
+- Contentious, judgemental, uncensored, can't agree with itself 32% of the time!
+- Trained a bit about nostr
+- Trained a bit about bitcoin
+- Trained a bit in the health domain
+I am having success with chat template: \<s\> \[INST\] ... \<\/s\>
 It may also work with ChatML format, though I see more repetitions when I use that.
@@ -42,10 +43,10 @@ The trainer, developer or uploader of this model does not assume any liability.
 ### Training Data
-Nostr notes, kind=1, longer notes are taken from reputable accounts.
-Number of notes: 300k
+Nostr related info from web and nostr itself, bitcoin related info.
 ### Training Procedure
 LLaMa-Factory is used to train on 2x3090! fsdp_qlora is the technique.
-It took ~170 hours for a dataset of 120MB.
+It took ~185 hours for a dataset of 122MB.
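
The two training-time figures in this diff imply slightly different average throughput. A minimal sketch of that arithmetic, assuming "MB" means the same unit in both versions of the README and that the hours are total wall-clock time (the helper name `mb_per_hour` is only illustrative, and both inputs are the README's own approximate "~" values):

```python
def mb_per_hour(dataset_mb: float, hours: float) -> float:
    """Average megabytes of dataset processed per hour of training."""
    return dataset_mb / hours


# Figures taken from the old and new README lines; both are approximate.
old_rate = mb_per_hour(120, 170)  # "~170 hours for a dataset of 120MB"
new_rate = mb_per_hour(122, 185)  # "~185 hours for a dataset of 122MB"

print(f"before: {old_rate:.2f} MB/h")
print(f"after:  {new_rate:.2f} MB/h")
```

Since both durations are rounded in the README, the difference between the two rates is a rough indication only.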