🚩 Report

#137
by hayleecs - opened

There is no information about the training data. How are we supposed to compare this model with the Llama 2 models, and how can we justify the change in perplexity on common datasets such as WikiText?
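
For context, here is a minimal sketch of the kind of perplexity comparison being asked about, following the standard sliding-window evaluation on WikiText-2. The baseline model ID, context length, and stride below are assumptions for illustration, not details from this model card:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical baseline; swap in the model under discussion to compare.
model_id = "meta-llama/Llama-2-7b-hf"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id).to(device)
model.eval()

# Concatenate the WikiText-2 test split into one long token stream.
test = load_dataset("wikitext", "wikitext-2-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_length = 1024  # scoring context window (assumption)
stride = 512       # overlap between windows (assumption)
seq_len = encodings.input_ids.size(1)

nlls = []
prev_end = 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_length, seq_len)
    target_len = end - prev_end  # only score tokens not already scored
    input_ids = encodings.input_ids[:, begin:end].to(device)
    target_ids = input_ids.clone()
    target_ids[:, :-target_len] = -100  # mask context-only tokens from the loss
    with torch.no_grad():
        loss = model(input_ids, labels=target_ids).loss
    # loss is averaged over target tokens; re-weight by window size
    # (a slight approximation, as in the standard recipe).
    nlls.append(loss * target_len)
    prev_end = end
    if end == seq_len:
        break

ppl = torch.exp(torch.stack(nlls).sum() / prev_end)
print(f"WikiText-2 perplexity: {ppl.item():.2f}")
```

Even with a script like this, the numbers are only interpretable if we know whether WikiText (or similar corpora) appeared in the training data, which is exactly why the missing data documentation matters.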
