loubnabnl HF staff committed on
Commit 228dc2e
1 Parent(s): da804e0

Update datasets/github_code.md

Files changed (1)
  1. datasets/github_code.md +2 -2
datasets/github_code.md CHANGED
@@ -17,6 +17,6 @@ print(next(iter(ds)))
  }
 
  ```
- You can see that in addition to the code, the samples include some metadata: repo name, path, language, license, and the size of the file.
-
+ You can see that in addition to the code, the samples include some metadata: repo name, path, language, license, and the size of the file. Below is the distribution of programming languages in this dataset.
+ ![dataset-statistics](https://huggingface.co/datasets/lvwerra/github-code/resolve/main/github-code-stats-alpha.png)
  For model-specific information about the pretraining dataset, please select a model below:
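
Note on the change: the sentence added in this commit points to a language-distribution figure. A rough way to compute such a distribution from the per-sample metadata might look like the sketch below. It assumes each sample is a dict with a `language` key (the diff lists "language" as a metadata field, but the exact key name is an assumption), and it uses hypothetical toy records in place of the real dataset stream. The helper name `language_distribution` is illustrative and not part of any library.

```python
from collections import Counter
from itertools import islice


def language_distribution(samples, limit=None):
    """Return {language: fraction of samples} over the first `limit` samples.

    `samples` is any iterable of dicts carrying a "language" key
    (assumed field name). `limit=None` consumes the whole iterable.
    """
    counts = Counter(sample["language"] for sample in islice(samples, limit))
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders languages from most to least frequent.
    return {lang: n / total for lang, n in counts.most_common()}


# Hypothetical stand-in records; real samples would come from the dataset
# iterator (`ds` in the hunk header above) and carry more fields.
toy_samples = [
    {"language": "Python", "size": 120},
    {"language": "Python", "size": 80},
    {"language": "Java", "size": 300},
    {"language": "C", "size": 50},
]

print(language_distribution(toy_samples))
```

Passing a `limit` keeps the tally cheap when the input is a streamed iterator, at the cost of reflecting only the first samples rather than the full dataset.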