Update README.md
tags:
- gpt2
- generation
datasets:
- codeparrot/codeparrot-clean-train
widget:
- text: "from transformer import"
  example_title: "Transformers"
You can load the CodeParrot model and tokenizer directly in `transformers`:

```Python
from transformers import AutoTokenizer, AutoModelWithLMHead

tokenizer = AutoTokenizer.from_pretrained("codeparrot/codeparrot")
model = AutoModelWithLMHead.from_pretrained("codeparrot/codeparrot")

inputs = tokenizer("def hello_world():", return_tensors="pt")
outputs = model(**inputs)
```
or with a `pipeline`:

```Python
from transformers import pipeline

pipe = pipeline("text-generation", model="codeparrot/codeparrot")
outputs = pipe("def hello_world():")
```
## Training

The model was trained on the cleaned [CodeParrot 🦜 dataset](https://huggingface.co/datasets/codeparrot/codeparrot-clean) in two steps. After the initial training (v1.0), the model was trained for another 30k steps, resulting in v1.1; the settings for both are given in the following table:

|Config| v1.0| v1.1|
|------|------------------|--------------------|
The [pass@k metric](https://huggingface.co/metrics/code_eval) tells the probability …
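As a minimal sketch, the unbiased estimator commonly used for pass@k (the probability that at least one of `k` samples drawn from `n` generations per problem, of which `c` pass the unit tests, is correct) can be computed with `math.comb`; the function name and the example numbers below are illustrative, not taken from this model's evaluation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n - c, k) / C(n, k),
    where n = total generations per problem and c = generations
    that pass the unit tests."""
    if n - c < k:
        # fewer than k failing generations: every size-k draw
        # must contain at least one passing generation
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 200 generations for a problem, 12 of them passing, k = 1
print(pass_at_k(200, 12, 1))  # approximately 0.06, i.e. 12/200
```

For k = 1 this reduces to the pass rate c/n; for larger k it accounts for drawing k samples without replacement.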
## Resources

- Dataset: [full](https://huggingface.co/datasets/codeparrot/codeparrot-clean), [train](https://huggingface.co/datasets/codeparrot/codeparrot-clean-train), [valid](https://huggingface.co/datasets/codeparrot/codeparrot-clean-valid)
- Code: [repository](https://github.com/huggingface/transformers/tree/master/examples/research_projects/codeparrot)
- Spaces: [generation](), [highlighting]()