Dataset

#1
by mrfakename - opened

Hi @Pclanglais ! This model looks really cool.
Are you planning to release the training dataset?
Thanks in advance!

Yes, at some point. I still need to do a bit more data work to ensure proper attribution of each excerpt.

@Pclanglais wouldn’t most of these texts be in the public domain?

Yes, definitely. But I prefer to give people the opportunity to trace them if they want to build upon it. It has nothing to do with copyright (and it's not complicated either: I just need to match my ids).

Hi. The dataset is now available: https://huggingface.co/datasets/Pclanglais/MonadGPT

Wow, thanks! I didn’t know people made instruction datasets in the 17th century :)!

mrfakename changed discussion status to closed
