Corianas committed on
Commit
5f37180
•
1 Parent(s): 213fd74

Create README.md

Files changed (1)
  1. README.md +56 -0
README.md ADDED
@@ -0,0 +1,56 @@
+ ---
+ license: cc-by-nc-sa-4.0
+ datasets:
+ - roneneldan/TinyStories
+ ---
+ This is a character-level model (English a-z, 0-9, and so on) trained following Andrej Karpathy's llama2.c project (https://github.com/karpathy/llama2.c) on both TinyStories and a similar internal dataset of my own. The last 150k is from a subset of Cosmopedia that I extracted for younger readers.
+
+ Trained for 49,152,000,000 tokens.
+
+ To read and write uppercase letters, the model uses a Shift-Key modifier placed before the letter that should become uppercase; it has never been trained on actual uppercase letters.
+
+ This modifier is ↨, and here are the functions I use to convert from plain text to the modified format and back.
+ ```
+ def add_caseifer(text):
+     # Using list comprehension for more efficient concatenation
+     return ''.join(['↨' + char.lower() if char.isupper() else char for char in text])
+
+ def remove_caseifer(text):
+     new_text = ""
+     i = 0
+     while i < len(text):
+         if text[i] == "↨":
+             if i+1 < len(text):
+                 # Uppercase the character that follows the marker
+                 new_text += text[i+1].upper()
+                 i += 1
+             else:
+                 pass  # trailing marker with nothing after it: skip this index
+         else:
+             new_text += text[i]
+         i += 1
+     return new_text
+ ```
+
+ For test strings to use in chat, try something like:
+ ```
+ ↨hello, my name is ↨clara and ↨i like
+ ```
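As a quick check of the conversion helpers, the sketch below restates them in compact form and shows that the test string is what `add_caseifer` produces from ordinary mixed-case text. The constant name `SHIFT` and the local variable names are my own and not part of the README, and the sketch assumes input where lowercasing a character yields exactly one character.

```python
SHIFT = "↨"  # shift-key marker from the README; the constant name is an assumption

def add_caseifer(text):
    # Replace each uppercase character with the marker plus its lowercase form.
    return "".join(SHIFT + ch.lower() if ch.isupper() else ch for ch in text)

def remove_caseifer(text):
    # Uppercase the character that follows each marker; a trailing marker is dropped.
    out = []
    i = 0
    while i < len(text):
        if text[i] == SHIFT:
            if i + 1 < len(text):
                out.append(text[i + 1].upper())
                i += 1  # consume the character that was just uppercased
        else:
            out.append(text[i])
        i += 1
    return "".join(out)

prompt = add_caseifer("Hello, my name is Clara and I like")
print(prompt)                   # ↨hello, my name is ↨clara and ↨i like
print(remove_caseifer(prompt))  # Hello, my name is Clara and I like
```

In practice this suggests running `add_caseifer` on a prompt before sending it to the model and `remove_caseifer` on the generated text before displaying it.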
+
+ Run history:
+ iter ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
+ loss/train █▅▄▅▃▅▄▃▄▄▄▄▃▃▄▄▂▁▂▂▃▃▂▃▃▃▂▃▂▃▂▂▂▂▃▁▂▂▁▂
+ loss/val █▇▆▅▅▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁
+ lr ▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
+ mfu ▅▅▅▄▅▅▅▅▅▅▄▅▅▅▁▅▅▅▄▅▅▅▅█▅▅▅▅▅▅▅▅▅▅▅▅█▅▅█
+ step_time ██████▇████████▇████▇██▂▇██▇█████▇▇▇▁▇█▁
+ tokens ▁▁▁▂▂▂▂▂▂▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇███
+
+ Run summary:
+ iter 500000
+ loss/train 0.48935
+ loss/val 0.45042
+ lr 1e-05
+ mfu 9.31042
+ step_time 63441.47873
+ tokens 49152000000
+
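The summary figures can be cross-checked with a little arithmetic. The sketch below divides the reported token count by the reported iteration count, then converts the validation loss to a per-character perplexity and to bits per character. The conversion assumes the logged loss is a mean cross-entropy in nats, which is the PyTorch default but is not stated in the README; the variable names are illustrative.

```python
import math

tokens = 49_152_000_000   # "tokens" from the run summary
iters = 500_000           # "iter" from the run summary
val_loss = 0.45042        # "loss/val" from the run summary

# Characters (tokens) processed per training iteration.
tokens_per_iter = tokens // iters
print(tokens_per_iter)    # 98304

# Assumption: the loss is mean cross-entropy in nats per character.
perplexity = math.exp(val_loss)
bits_per_char = val_loss / math.log(2)
print(round(perplexity, 3), round(bits_per_char, 3))
```

Because the model is character-level, one token corresponds to one character, so the per-iteration figure is also the number of characters seen per step.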