training data
#4
Opened by simeneide
Hi, you write in the doc:
Training data
NB-GPT-J-6B was finetuned on NCC, the Norwegian Colossal Corpus, plus other Internet sources like Wikipedia, mC4, and OSCAR.
Are there any news sources in this "other" category, and do you have an approximate amount?
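In case it helps, here is a minimal sketch of how one might check the NCC side of this empirically: stream the public NbAiLab/NCC dataset and tally its per-document source labels. This assumes the dataset exposes a `doc_type` field (adjust the field name if the schema differs), and it only covers NCC, not the extra Wikipedia/mC4/OSCAR data.

```python
# Sketch: tally source labels in the Norwegian Colossal Corpus (NCC).
# Assumption: the NbAiLab/NCC dataset has a "doc_type" field whose values
# identify the source (e.g. newspaper vs. government vs. Wikipedia text).
from collections import Counter
from datasets import load_dataset

# Stream to avoid downloading the full corpus up front.
ncc = load_dataset("NbAiLab/NCC", split="train", streaming=True)

counts = Counter()
for i, example in enumerate(ncc):
    counts[example["doc_type"]] += 1
    if i >= 100_000:  # sample a prefix; drop this cap for exact counts
        break

# Any doc_type containing "news" would point to newspaper material.
for doc_type, n in counts.most_common():
    print(f"{doc_type}: {n}")
```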
versae changed discussion status to closed