opt-peter-1.3B / vocab.json
pszemraj
add new checkpoint trained for a hundred steps with smaller max grad norm and weight decay
7a20e92
798 kB