# CodeParrot 🦜

CodeParrot 🦜 is a GPT-2 model (1.5B parameters) trained to generate Python code.

## Usage

You can load the CodeParrot model and tokenizer directly in `transformers`:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot")
model = AutoModelForCausalLM.from_pretrained("lvwerra/codeparrot")

inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)  # forward pass; outputs.logits holds the next-token scores
```
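
A plain forward pass only returns logits. To actually sample a completion, you can call `generate`; the decoding parameters below are illustrative defaults, not the settings used for the reported results:

```python
# reusing the tokenizer and model loaded above
inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model.generate(**inputs, do_sample=True, temperature=0.8, max_new_tokens=64)
print(tokenizer.decode(outputs[0]))
```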

or with a `pipeline`:

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="lvwerra/codeparrot")
outputs = pipe("def hello_world():")
```
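
The pipeline forwards the usual generation keyword arguments to the model; for example (the parameter choices here are just an illustration):

```python
# sample three candidate completions of up to 64 tokens each
outputs = pipe("def hello_world():", num_return_sequences=3, max_length=64, do_sample=True)
```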

## Training

The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/lvwerra/codeparrot-clean) with the following settings:

| Config | Value |
|--------|-------|
| Batch size | 512 |
| Context size | 1024 |
| Training steps | 50,000 |
| Gradient accumulation steps | 16 |
| Gradient checkpointing | True |
| Learning rate | 2e-4 |
| Weight decay | 0.1 |
| Warmup steps | 750 |
| Schedule | Cosine |

The training was executed on 16 x A100 (40GB) GPUs. At 512 sequences of 1024 tokens per step, the 50,000 steps amount to roughly 26 billion training tokens.
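
The headline token count follows directly from the table above; as a quick sanity check:

```python
# tokens processed = sequences per step × tokens per sequence × steps
batch_size = 512
context_size = 1024
training_steps = 50_000
print(f"{batch_size * context_size * training_steps:,}")  # 26,214,400,000 ≈ 26B
```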

## Performance

We evaluated the model on OpenAI's [HumanEval](https://huggingface.co/datasets/openai_humaneval) benchmark, which consists of programming challenges:

| Metric | Value |
|--------|-------|
| pass@1 | 3.58% |
| pass@10 | 8.03% |
| pass@100 | 14.96% |

The [pass@k metric](https://huggingface.co/metrics/code_eval) estimates the probability that at least one out of k generated samples passes the unit tests.
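
For reference, pass@k is typically computed with the unbiased estimator from the Codex paper: given n generated samples per problem, of which c pass the tests, a minimal sketch is:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of pass@k: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one passing sample
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))
```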

## Resources

- Dataset: [full](https://huggingface.co/datasets/lvwerra/codeparrot-clean), [train](https://huggingface.co/datasets/lvwerra/codeparrot-clean-train), [valid](https://huggingface.co/datasets/lvwerra/codeparrot-clean-valid)
- Code: [repository](https://github.com/huggingface/transformers/tree/master/examples/research_projects/codeparrot)
- Spaces: [generation](), [highlighting]()