lvwerra HF staff Muennighoff committed on
Commit
eda1407
1 Parent(s): b9a94ab

Add OctoPack (#4)

- Add OctoPack (b022c0a75645e7b0e258b060fdc6107f4c895e2e)
- Update README.md (c32c7ba55711c543617dec7ed9854d21bc43bdf2)


Co-authored-by: Niklas Muennighoff <Muennighoff@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +14 -0
README.md CHANGED
@@ -47,6 +47,20 @@ StarCoder is a 15.5B parameters language model for code trained for 1T tokens on
  - [StarCoder Search](https://huggingface.co/spaces/bigcode/search): Full-text search code in the pretraining dataset.
  - [StarCoder Membership Test](https://stack.dataportraits.org/): Blazing fast test if code was present in pretraining dataset.
 
+ ---
+
+ ## 🐙OctoPack
+ OctoPack consists of data, evals & models relating to Code LLMs that follow human instructions.
+
+ - [Paper](https://arxiv.org/abs/2308.07124): Research paper with details about all components of OctoPack.
+ - [GitHub](https://github.com/bigcode-project/octopack): All code used for the creation of OctoPack.
+ - [CommitPack](https://huggingface.co/datasets/bigcode/commitpack): 4TB of Git commits.
+ - [Am I in the CommitPack](https://huggingface.co/spaces/bigcode/in-the-commitpack): Check if your code is in the CommitPack.
+ - [CommitPackFT](https://huggingface.co/datasets/bigcode/commitpackft): 2GB of high-quality Git commits that resemble instructions.
+ - [HumanEvalPack](https://huggingface.co/datasets/bigcode/humanevalpack): Benchmark for Code Fixing/Explaining/Synthesizing across Python/JavaScript/Java/Go/C++/Rust.
+ - [OctoCoder](https://huggingface.co/bigcode/octocoder): Instruction tuned model of StarCoder by training on CommitPackFT.
+ - [OctoCoder Demo](https://huggingface.co/spaces/bigcode/OctoCoder-Demo): Play with OctoCoder.
+ - [OctoGeeX](https://huggingface.co/bigcode/octogeex): Instruction tuned model of CodeGeeX2 by training on CommitPackFT.
 
  ---