loubnabnl (HF staff) committed
Commit 065248a
1 Parent(s): 646ad1c

Update README.md

Files changed (1):
  README.md (+6 -6)
README.md CHANGED
@@ -5,7 +5,7 @@ tags:
 - gpt2
 - generation
 datasets:
-- lvwerra/codeparrot-clean-train
+- codeparrot/codeparrot-clean-train
 widget:
 - text: "from transformer import"
   example_title: "Transformers"
@@ -48,8 +48,8 @@ You can load the CodeParrot model and tokenizer directly in `transformers`:
 ```Python
 from transformers import AutoTokenizer, AutoModelWithLMHead
 
-tokenizer = AutoTokenizer.from_pretrained("lvwerra/codeparrot")
-model = AutoModelWithLMHead.from_pretrained("lvwerra/codeparrot")
+tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
+model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot")
 
 inputs = tokenizer("def hello_world():", return_tensors="pt")
 outputs = model(**inputs)
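
Note that the snippet in this hunk runs a plain forward pass, so `outputs` holds raw logits rather than generated code. A minimal sketch of sampling an actual completion with `generate`, using the updated repo id; the decoding settings (`max_new_tokens`, `do_sample`, `top_p`) are illustrative choices, not values from the card:

```Python
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot")

inputs = tokenizer("def hello_world():", return_tensors="pt")
# sample up to 32 new tokens, then decode the ids back to source code
output_ids = model.generate(**inputs, max_new_tokens=32, do_sample=True, top_p=0.95)
print(tokenizer.decode(output_ids[0]))
```
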
@@ -60,13 +60,13 @@ or with a `pipeline`:
 ```Python
 from transformers import pipeline
 
-pipe = pipeline("text-generation", model="lvwerra/codeparrot")
+pipe = pipeline("text-generation", model="codeparrot/codeparrot")
 outputs = pipe("def hello_world():")
 ```
 
 ## Training
 
-The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/lvwerra/codeparrot-clean) in two steps. After the initial training (v1.0) the model was trained for another 30k steps resulting in v1.1 and you find the settings in the following table:
+The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/codeparrot/codeparrot-clean) in two steps. After the initial training (v1.0), the model was trained for another 30k steps, resulting in v1.1; the settings are listed in the following table:
 
 |Config| v1.0| v1.1|
 |------|------------------|--------------------|
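
The `pipeline` call in this hunk returns a list of dicts, one per generated sequence. A short sketch of reading the generated text; `max_new_tokens` is an illustrative setting, not from the card:

```Python
from transformers import pipeline

pipe = pipeline("text-generation", model="codeparrot/codeparrot")
outputs = pipe("def hello_world():", max_new_tokens=32)
# text-generation pipelines return [{"generated_text": ...}, ...]
print(outputs[0]["generated_text"])
```
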
@@ -96,6 +96,6 @@ The [pass@k metric](https://huggingface.co/metrics/code_eval) tells the probabil
 
 ## Resources
 
-- Dataset: [full](https://huggingface.co/datasets/lvwerra/codeparrot-clean), [train](https://huggingface.co/datasets/lvwerra/codeparrot-clean-train), [valid](https://huggingface.co/datasets/lvwerra/codeparrot-clean-valid)
+- Dataset: [full](https://huggingface.co/datasets/codeparrot/codeparrot-clean), [train](https://huggingface.co/datasets/codeparrot/codeparrot-clean-train), [valid](https://huggingface.co/datasets/codeparrot/codeparrot-clean-valid)
 - Code: [repository](https://github.com/huggingface/transformers/tree/master/examples/research_projects/codeparrot)
 - Spaces: [generation](), [highlighting]()
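
The hunk header above cites the [pass@k metric](https://huggingface.co/metrics/code_eval). A minimal numpy sketch of the unbiased pass@k estimator from the Codex paper, 1 - C(n-c, k) / C(n, k); the helper name and example numbers are illustrative, not from the card:

```Python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    # n generations per problem, c of which pass the unit tests;
    # 1 - C(n-c, k) / C(n, k), computed as a numerically stable product
    if n - c < k:
        return 1.0  # every size-k draw contains at least one passing sample
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# e.g. 200 generations with 12 passing: estimate pass@10
print(pass_at_k(n=200, c=12, k=10))
```
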
5
  - gpt2
6
  - generation
7
  datasets:
8
+ - codeparrot/codeparrot-clean-train
9
  widget:
10
  - text: "from transformer import"
11
  example_title: "Transformers"
48
  ```Python
49
  from transformers import AutoTokenizer, AutoModelWithLMHead
50
 
51
+ tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
52
+ model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot")
53
 
54
  inputs = tokenizer("def hello_world():", return_tensors="pt")
55
  outputs = model(**inputs)
60
  ```Python
61
  from transformers import pipeline
62
 
63
+ pipe = pipeline("text-generation", model="codeparrot/codeparrot")
64
  outputs = pipe("def hello_world():")
65
  ```
66
 
67
  ## Training
68
 
69
+ The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/codeparrot/codeparrot-clean) in two steps. After the initial training (v1.0) the model was trained for another 30k steps resulting in v1.1 and you find the settings in the following table:
70
 
71
  |Config| v1.0| v1.1|
72
  |------|------------------|--------------------|
96
 
97
  ## Resources
98
 
99
+ - Dataset: [full](https://huggingface.co/datasets/codeparrot/codeparrot-clean), [train](https://huggingface.co/datasets/codeparrot/codeparrot-clean-train), [valid](https://huggingface.co/datasets/codeparrot/codeparrot-clean-valid)
100
  - Code: [repository](https://github.com/huggingface/transformers/tree/master/examples/research_projects/codeparrot)
101
  - Spaces: [generation](), [highlighting]()