Spaces:
Running
on
CPU Upgrade
update the note
app.py
CHANGED
@@ -34,7 +34,7 @@ As part of the BigCode project, we released and maintain [The Stack V2](https://
 
 This tool lets you check if a repository under a given username is part of The Stack dataset. Would you like to have your data removed from future versions of The Stack? You can opt-out following the instructions [here](https://www.bigcode-project.org/docs/about/the-stack/#how-can-i-request-that-my-data-be-removed-from-the-stack). Note that previous opt-outs might still be displayed in the release candidate (denoted with "-rc"), which will be removed for the release.
 
-**Note
+**Note:** The Stack v2.0 is built from public GitHub code provided by the [Software Heriage Archive](https://archive.softwareheritage.org/). It may include repositories that are no longer present on GitHub but were archived by Software Heritage. Before training the StarCoder 1 and 2 models an additional PII pipeline was run to remove names, emails, passwords and API keys from the code files. For more information see the [paper](https://arxiv.org/abs/2402.19173).
 
 **Data source**:\
 <img src="https://annex.softwareheritage.org/public/logo/software-heritage-logo-title.2048px.png" alt="Logo" style="height: 3em; vertical-align: middle;" />