---
language: code
tags:
- code
- gpt2
- generation
datasets:
- giulio98/xlcost-single-prompt
widget:
- text: "'''\nfunction to add two numbers\n'''\n###\n"
  example_title: "add two numbers"
model-index:
- name: codegen-350M-multi-xlcost
  results:
  - task:
      name: Code Generation
      type: code-generation
    dataset:
      name: "XLCost"
      type: code_eval_outputs
    metrics:
    - name: pass@1
      type: code_eval_outputs
      value: 3.325
    - name: pass@10
      type: code_eval_outputs
      value: 15
    - name: codebleu
      type: codebleu
      value: 20.18191
---

# CodeGen-350M-multi-xlcost-v2

CodeGen-350M-multi-xlcost-v2 is a CodeGen model fine-tuned on the Python split of the XLCost dataset using DeepSpeed.

## Usage

You can load the CodeGen-350M-multi-xlcost-v2 model and tokenizer directly in `transformers`:

```Python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("giulio98/codegen-350M-multi-xlcost-v2")
model = AutoModelForCausalLM.from_pretrained("giulio98/codegen-350M-multi-xlcost-v2")

# Prompts follow the fine-tuning format: the EOS token, a docstring describing
# the task, and the "###" separator that marks where the code should start.
text = tokenizer.eos_token + "'''\n" + "function to add two numbers" + "\n'''\n" + "###\n"
input_ids = tokenizer(text, return_tensors="pt").input_ids

generated_ids = model.generate(input_ids, max_length=128)
print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```

Output:
```Python
'''
function to add two numbers
'''
###
def add(a, b):
    return a + b
```

## Training

The model was fine-tuned on [XLCost-single-prompt](https://huggingface.co/datasets/giulio98/xlcost-single-prompt), an improved version of the original XLCost dataset [xlcost-text-to-code](https://huggingface.co/datasets/codeparrot/xlcost-text-to-code). The hyperparameters are listed below.

| Hyperparameter | Value |
|-----------------------------|--------|
| Per device train batch size | 16 |
| Context size | 1024 |
| Training steps | 259 |
| Gradient accumulation | 2 |
| Gradient checkpointing | True |
| Learning rate | 1.8e-05 |
| Weight decay | 0.1 |
| Warmup steps | 35 |
| Schedule | linear |
| ZeRO stage | 2 |

Below is the DeepSpeed configuration:

```json
{
    "fp16": {
        "enabled": true,
        "loss_scale": 0,
        "loss_scale_window": 1000,
        "initial_scale_power": 16,
        "hysteresis": 2,
        "min_loss_scale": 1
    },
    "optimizer": {
        "type": "AdamW",
        "params": {
            "lr": 0.000018,
            "betas": [
                0.9,
                0.999
            ],
            "eps": 1e-8,
            "weight_decay": 0.1
        }
    },
    "scheduler": {
        "type": "WarmupLR",
        "params": {
            "warmup_min_lr": 0,
            "warmup_max_lr": 0.000018,
            "warmup_num_steps": 35
        }
    },
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {
            "device": "cpu",
            "pin_memory": false
        },
        "allgather_partitions": true,
        "allgather_bucket_size": 200000000,
        "overlap_comm": true,
        "reduce_scatter": true,
        "reduce_bucket_size": 200000000,
        "contiguous_gradients": true
    },
    "gradient_accumulation_steps": 2,
    "train_batch_size": 32,
    "train_micro_batch_size_per_gpu": 16,
    "gradient_clipping": 1,
    "wall_clock_breakdown": false
}
```

The training was executed on a single V100 (16 GB) GPU and took 28 min 50 s.
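The original training script is not part of this card. Purely as an illustration, here is a minimal sketch of how the hyperparameters above and the DeepSpeed configuration (saved as `ds_config.json`) could be wired together with the `transformers` `Trainer`; the base checkpoint `Salesforce/codegen-350M-multi`, the dataset split, the `text` column name, and the preprocessing are assumptions, not the exact recipe used.

```Python
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

# Base checkpoint (assumed); CodeGen has no pad token by default.
base_model = "Salesforce/codegen-350M-multi"
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Python config of the fine-tuning dataset; the "text" column is an assumption.
dataset = load_dataset("giulio98/xlcost-single-prompt", "Python", split="train")

def tokenize(batch):
    # Context size of 1024 tokens, as in the hyperparameter table above.
    return tokenizer(batch["text"], truncation=True, max_length=1024)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

# Hyperparameters from the table above; the DeepSpeed JSON above is saved as ds_config.json.
args = TrainingArguments(
    output_dir="codegen-350M-multi-xlcost-v2",
    per_device_train_batch_size=16,
    gradient_accumulation_steps=2,
    gradient_checkpointing=True,
    learning_rate=1.8e-5,
    weight_decay=0.1,
    warmup_steps=35,
    lr_scheduler_type="linear",
    max_steps=259,
    fp16=True,
    deepspeed="ds_config.json",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

With ZeRO stage 2 and CPU optimizer offload, such a script would typically be launched through the DeepSpeed launcher, e.g. `deepspeed --num_gpus=1 train.py`.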
## Performance

We evaluated the model on the first 400 samples of the [XLCost-single-prompt test split](https://huggingface.co/datasets/giulio98/xlcost-single-prompt/viewer/Python/test), comparing the output of the generated code against the expected output using the pass@k metric.

| Metric | codegen-350M-multi-xlcost-v2 | codegen-350M-multi-xlcost | codegen-350M-mono (zero-shot) | codegen-350M-mono (one-shot) | codegen-350M-mono (few-shot) |
|--------|-----|-----|-----|-----|-----|
| pass@1 | 3.325% | 3.70% | 0.4% | 0.35% | 0.48% |
| pass@10 | 15% | 14.5% | 3.5% | 3% | 3.75% |
| CodeBLEU | 20.18% | None | 15.15% | 19.42% | 20.27% |

The [pass@k metric](https://huggingface.co/metrics/code_eval) gives the probability that at least one out of k generated samples passes the tests (a minimal example of computing it with the Hugging Face `evaluate` library is included at the end of this card).

## Citations

```
@article{Nijkamp2022ACP,
  title={A Conversational Paradigm for Program Synthesis},
  author={Nijkamp, Erik and Pang, Bo and Hayashi, Hiroaki and Tu, Lifu and Wang, Huan and Zhou, Yingbo and Savarese, Silvio and Xiong, Caiming},
  journal={arXiv preprint},
  year={2022}
}
```
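For reference, below is a minimal, illustrative sketch of how pass@k can be computed with the Hugging Face `evaluate` library's `code_eval` metric; the toy problem, test string, and candidate completions are made up and are not the harness used for the numbers above.

```Python
import os

import evaluate

# code_eval executes model-generated code, so it must be enabled explicitly.
os.environ["HF_ALLOW_CODE_EVAL"] = "1"

code_eval = evaluate.load("code_eval")

# One test string per problem and a list of candidate completions per problem.
# In practice you would sample several completions per prompt from the model
# (e.g. 10 or more to estimate pass@10).
test_cases = ["assert add(2, 3) == 5"]
candidates = [[
    "def add(a, b):\n    return a + b",  # passes the test
    "def add(a, b):\n    return a - b",  # fails the test
]]

pass_at_k, results = code_eval.compute(
    references=test_cases,
    predictions=candidates,
    k=[1, 2],
)
print(pass_at_k)  # e.g. {'pass@1': 0.5, 'pass@2': 1.0}
```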