ammarnasr committed · Commit d64e7af · 1 Parent(s): 34acf73

Create README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
---
license: mit
datasets:
- bigcode/the-stack
library_name: adapter-transformers
tags:
- code
---

# CodeGen (CodeGen-Mono 350M LoRA Java)

## Model description
CodeGen LoRA Java is a family of autoregressive language models fine-tuned with LoRA (Low-Rank Adaptation) on different programming languages; this checkpoint is the Java variant.
## Training data
<!-- https://huggingface.co/datasets/ammarnasr/the-stack-java-clean -->
This model was fine-tuned on the cleaned Java subset of The Stack, available [here](https://huggingface.co/datasets/ammarnasr/the-stack-java-clean). The data consists of 1 million Java code files.
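
For a quick look at the training data, the dataset can be streamed with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption about how the dataset is organized:

```python
from datasets import load_dataset

# Stream the cleaned Java subset instead of downloading everything up front.
# The "train" split name is assumed here.
ds = load_dataset("ammarnasr/the-stack-java-clean", split="train", streaming=True)

# Peek at a single example to inspect the available fields.
example = next(iter(ds))
print(example.keys())
```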

## Training procedure

This model was fine-tuned with LoRA on a single T4 GPU for 10,000 steps with a batch size of 4, using the causal language modeling objective.
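
For illustration, a LoRA fine-tuning setup along these lines can be assembled with the `peft` library. This is only a sketch: the rank, alpha, dropout, and target modules below are assumptions, not the hyperparameters actually used for this model.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base CodeGen-Mono 350M model to wrap with LoRA adapters.
base = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

# Hypothetical LoRA configuration; the card does not state the values used.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj"],  # CodeGen uses a fused qkv_proj attention projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```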

## Evaluation results

We evaluate our models on the MultiPL-E benchmark. The model achieves a pass@10 rate of 8.9.
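
For reference, pass@k is typically reported with the unbiased estimator introduced with the HumanEval/Codex evaluation, where n samples are generated per problem and c of them pass the unit tests (the card does not state the exact harness settings used here):

```math
\text{pass@}k = \mathbb{E}_{\text{problems}}\left[ 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \right]
```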


## Intended Use and Limitations

The model is intended for, and is best at, **program synthesis**: generating executable code from English prompts, where the prompts take the form of a comment string. The model can also complete partially written code in Java and Python.

## How to use

This model can be easily loaded using the `AutoModelForCausalLM` functionality:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Tokenizer from the fine-tuned repo; base CodeGen-Mono 350M weights from Salesforce.
tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-java")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

text = "def hello_world():"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```
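
Since the card tags the repository with `adapter-transformers`, the checkpoint may be distributed as a LoRA adapter rather than full model weights. If so, the adapter can be applied on top of the base model with the `peft` library. This is a sketch under that assumption; the Java comment-style prompt is only an illustration of the prompting format described above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Base CodeGen-Mono 350M plus the LoRA adapter from this repo (assumed to be a PEFT adapter).
tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-java")
base = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")
model = PeftModel.from_pretrained(base, "ammarnasr/codegen-350M-mono-java")

# Prompt with a Java comment describing the desired function, as suggested in "Intended Use".
prompt = "// Returns the sum of all even numbers in the array\npublic static int sumEven(int[] nums) {"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```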

## BibTeX entry and citation info

```bibtex
@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}
```