Why aren't all my public repos included?

#4
by yehiaserag - opened

I really appreciate what you guys are doing and how this project can benefit everyone on the planet.
I searched using the Space and could find only one of my public repos.
Why aren't the others included?

My repos aren’t in it either

Hi there! This first version of The Stack includes only 30 programming languages (corresponding to the subfolders here: https://huggingface.co/datasets/bigcode/the-stack/tree/v1.0/data).
It also includes only permissively licensed repos (https://huggingface.co/datasets/bigcode/the-stack#licensing-information).

I code in HTML and Batch.

BigCode org

Do your repositories have a clear license? If no license can be detected, or the detected license is not considered permissive, the repository is excluded from the dataset.

I didn't have a license there

Code without a license (specifically without a permissive license) was not included in the dataset.
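
To make the rule above concrete, here is a minimal sketch of that kind of filter. It is only an illustration, not the actual pipeline: the `PERMISSIVE_LICENSES` set and the `is_included` helper are hypothetical names, and the real list of accepted licenses is the one described in the dataset's licensing information.

```python
from typing import Optional

# Hypothetical allow-list for illustration only; the dataset's actual list
# of permissive licenses is documented in its licensing information.
PERMISSIVE_LICENSES = {"mit", "apache-2.0", "bsd-2-clause", "bsd-3-clause"}


def is_included(detected_license: Optional[str]) -> bool:
    """Return True if a repo with this detected license would pass the filter.

    `detected_license` is None when no license could be detected.
    """
    if detected_license is None:
        # No detectable license: excluded.
        return False
    # A detected license must be on the permissive allow-list.
    return detected_license.strip().lower() in PERMISSIVE_LICENSES


for name in ("MIT", None, "GPL-3.0"):
    print(name, "->", "included" if is_included(name) else "excluded")
```

Under these assumptions, a repo with no license file is excluded in the same way as one with a non-permissive license.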

Closing the issue, but feel free to reopen it if you have more questions!

christopher changed discussion status to closed

I've added licenses and hope those repos will be included in future versions.
