ammarnasr committed · Commit d64e7af · 1 Parent(s): 34acf73

Create README.md

Files changed (1)
  1. README.md +57 -0
README.md ADDED
---
license: mit
datasets:
- bigcode/the-stack
library_name: adapter-transformers
tags:
- code
---

# CodeGen (CodeGen-Mono 350M LoRA Java)

## Model description
CodeGen LoRA Java is a family of autoregressive language models fine-tuned with LoRA (Low-Rank Adaptation) on different programming languages; this checkpoint is the Java variant.
## Training data
<!-- https://huggingface.co/datasets/ammarnasr/the-stack-java-clean -->
This model was fine-tuned on the cleaned Java subset of The Stack, available [here](https://huggingface.co/datasets/ammarnasr/the-stack-java-clean). The data consists of 1 million Java code files.
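
For a quick look at the training data, the dataset can be streamed with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption about how the dataset is organized:

```python
from datasets import load_dataset

# Stream the cleaned Java subset instead of downloading everything up front.
# The "train" split name is assumed here.
ds = load_dataset("ammarnasr/the-stack-java-clean", split="train", streaming=True)

# Peek at a single example to inspect the available fields.
example = next(iter(ds))
print(example.keys())
```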

## Training procedure

This model was fine-tuned with LoRA on a single T4 GPU for 10,000 steps with a batch size of 4, using the causal language modeling objective.
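
For illustration, a LoRA fine-tuning setup along these lines can be assembled with the `peft` library. This is only a sketch: the rank, alpha, dropout, and target modules below are assumptions, not the hyperparameters actually used for this model.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base CodeGen-Mono 350M model to wrap with LoRA adapters.
base = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

# Hypothetical LoRA configuration; the card does not state the values used.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["qkv_proj"],  # CodeGen uses a fused qkv_proj attention projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```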

## Evaluation results

We evaluate our models on the MultiPL-E benchmark. The model achieves a pass@10 rate of 8.9.
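
For reference, pass@k is typically reported with the unbiased estimator introduced with the HumanEval/Codex evaluation, where n samples are generated per problem and c of them pass the unit tests (the card does not state the exact harness settings used here):

```math
\text{pass@}k = \mathbb{E}_{\text{problems}}\left[ 1 - \frac{\binom{n-c}{k}}{\binom{n}{k}} \right]
```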


## Intended Use and Limitations

The model is intended for, and is best at, **program synthesis**: generating executable code from English prompts, where the prompts take the form of a comment string. The model can also complete partially written code in Java and Python.

## How to use

This model can be easily loaded using the `AutoModelForCausalLM` functionality:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Tokenizer from the fine-tuned repo; base CodeGen-Mono 350M weights from Salesforce.
tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-java")
model = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")

text = "def hello_world():"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```
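
Since the card tags the repository with `adapter-transformers`, the checkpoint may be distributed as a LoRA adapter rather than full model weights. If so, the adapter can be applied on top of the base model with the `peft` library. This is a sketch under that assumption; the Java comment-style prompt is only an illustration of the prompting format described above:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Base CodeGen-Mono 350M plus the LoRA adapter from this repo (assumed to be a PEFT adapter).
tokenizer = AutoTokenizer.from_pretrained("ammarnasr/codegen-350M-mono-java")
base = AutoModelForCausalLM.from_pretrained("Salesforce/codegen-350M-mono")
model = PeftModel.from_pretrained(base, "ammarnasr/codegen-350M-mono-java")

# Prompt with a Java comment describing the desired function, as suggested in "Intended Use".
prompt = "// Returns the sum of all even numbers in the array\npublic static int sumEven(int[] nums) {"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```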

## BibTeX entry and citation info

```bibtex
@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}
```