adamo1139 commited on
Commit
afda9e6
1 Parent(s): b6f474f

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +10 -0
README.md CHANGED
@@ -3,3 +3,13 @@ license: other
3
  license_name: other
4
  license_link: LICENSE
5
  ---
 
 
 
 
 
 
 
 
 
 
 
3
  license_name: other
4
  license_link: LICENSE
5
  ---
6
+ Mistral 7B model fine-tuned on AEZAKMI v1 dataset that is derived from airoboros 2.2.1 and airoboros 2.2.
7
+ Finetuned with axolotl, using qlora and nf4 double quant, around 2 epochs, batch size 8, lr 0.00008, lr scheduler cosine. Scheduled training was 5 epochs, but loss seemed fine after 2 so I finished it quicker.
8
+ Training took around 10 hours on single RTX 3090 Ti.
9
+
10
+ Main feature of this model is that it's output is free of refusals and it feels somehow more natural.
11
+ Prompt format is standard chatml.
12
+ Don't expect it to be good at math, riddles or be crazy smart. My end goal with AEZAKMI is to create a cozy free chatbot.
13
+
14
+
15
+ Not sure what license it needs to have, given license of airoboros dataset. I'll leave it as other for now.