joshdevins
New training version with all tokens labelled
def1aef unverified
epoch = 3.0
train_runtime = 6988.1143
train_samples_per_second = 0.754