joey00072 committed on
Commit 10c4986
1 Parent(s): 39ca915

Update README.md

Files changed (1)
  1. README.md +23 -0
README.md CHANGED
@@ -1,3 +1,26 @@
  ---
  license: mit
  ---
+
+ ghostmax/softmax1 models trained on TinyStories for 500M tokens.
+
+ Checkpoints will be the `out_*` files. Two models were trained: one with ghostmax and one with regular softmax.
+
+ REPO: https://github.com/joey00072/llama2.c/tree/ghostmax
+
+ softmax: https://wandb.ai/shubhamchoudhari00072/ghostmax/reports/loss-val-23-08-25-21-48-58---Vmlldzo1MjM0MjUw
+ ghostmax: https://wandb.ai/shubhamchoudhari00072/ghostmax/reports/loss-val-23-08-25-21-50-15---Vmlldzo1MjM0MjYw
+ <iframe src="https://wandb.ai/shubhamchoudhari00072/ghostmax/runs/cbeei9uh?workspace=user-shubhamchoudhari00072" style="border:none;height:1024px;width:100%"></iframe>
+
+ ```python
+ import torch
+
+ def softmax(x, dim=-1):
+     # Subtract the max along `dim` for numerical stability (dim defaults to the last axis).
+     e_x = torch.exp(x - torch.max(x, dim=dim, keepdim=True)[0])
+     return e_x / e_x.sum(dim=dim, keepdim=True)
+
+ def ghostmax(x, dim=-1):
+     # Same as softmax, plus an extra 1 in the denominator, so outputs can sum to less than 1.
+     e_x = torch.exp(x - torch.max(x, dim=dim, keepdim=True)[0])
+     return e_x / (1 + e_x.sum(dim=dim, keepdim=True))
+ ```
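One detail of the `ghostmax` in the README is worth spelling out. Because the 1 is added after the max shift, the function equals `exp(x_i) / (exp(max) + sum_j exp(x_j))`, so the extra "ghost" logit sits at the row maximum rather than at 0. The sketch below is not code from the repo. It is a dependency-free illustration on plain Python lists, and the names `ghostmax_readme` and `softmax1_stable` are hypothetical. `softmax1_stable` assumes the intended target is `exp(x_i) / (1 + sum_j exp(x_j))` and keeps it numerically stable by treating the 1 as `exp(0)` and including that 0 when picking the shift.

```python
import math


def softmax(xs):
    """Standard softmax over a list of floats, shifted by the max for stability."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def ghostmax_readme(xs):
    """List version of the README's ghostmax: the +1 is added after the max shift."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = 1.0 + sum(exps)
    return [e / total for e in exps]


def softmax1_stable(xs):
    """exp(x_i) / (1 + sum_j exp(x_j)), computed without overflow.

    Assumption: the extra 1 is exp(0), a ghost logit fixed at 0, so the
    shift is max(max(xs), 0) and the 1 becomes exp(-shift).
    """
    m = max(max(xs), 0.0)
    exps = [math.exp(x - m) for x in xs]
    total = math.exp(-m) + sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.0]
    print("softmax sum:        ", sum(softmax(logits)))
    print("ghostmax_readme sum:", sum(ghostmax_readme(logits)))
    print("softmax1_stable sum:", sum(softmax1_stable(logits)))
```

The two variants behave differently when every logit is very negative: `softmax1_stable` lets the whole row go toward 0, while `ghostmax_readme` is shift-invariant and its row sum always stays between 1/2 and n/(n+1) for n inputs. Which behaviour is wanted is a design choice; the checkpoints here were trained with the function as written in the README.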