CodeBERTa-ft-coco-1e-05lr

Model for the paper "A Transformer-Based Approach for Smart Invocation of Automatic Code Completion".

Description

This model is fine-tuned on a code-completion dataset collected from the open-source Code4Me plugin. The goal is a small, lightweight transformer model that filters out unnecessary and unhelpful code completions. To this end, we leverage in-IDE telemetry data and integrate it with the textual code data in the transformer's attention module.
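
As a rough illustration of the filtering idea, the sketch below turns classifier logits into an invoke/skip decision. It makes several assumptions that this card does not confirm: that the checkpoint loads through `AutoModelForSequenceClassification`, that the repository ships a tokenizer, that class index 1 means "show a completion", and that a 0.5 threshold is sensible. The helper names are hypothetical, and the raw-text input ignores the context preprocessing described in the paper.

```python
import math

# Repository id as listed on the hosting page for this model.
REPO_ID = "AISE-TUDelft/CodeBERTa-ft-coco-1e-05lr"


def invoke_probability(logits):
    """Softmax probability of class 1 (ASSUMED to mean 'invoke completion')."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    return exps[1] / sum(exps)


def should_invoke(logits, threshold=0.5):
    """Hypothetical decision rule; the threshold value is illustrative only."""
    return invoke_probability(logits) >= threshold


def score_context(code_context):
    """Return raw logits for one code context (downloads the model when called)."""
    # Imported lazily so the pure helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)  # ASSUMES a bundled tokenizer
    model = AutoModelForSequenceClassification.from_pretrained(REPO_ID)
    model.eval()
    inputs = tokenizer(code_context, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return logits.tolist()
```

A caller would pass the output of `score_context(...)` to `should_invoke(...)` and request a completion only when it returns `True`.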

Models are named as follows:

  • CodeBERTa → CodeBERTa-ft-coco-[1,2,5]e-05lr
    • e.g. CodeBERTa-ft-coco-2e-05lr, which was trained with a learning rate of 2e-05.
  • JonBERTa-head → JonBERTa-head-ft-[dense,proj,reinit]
    • e.g. JonBERTa-head-ft-dense-proj. All use a learning rate of 2e-05, but may differ in the head layer in which the telemetry features are introduced (either dense or proj, with optional reinitialisation of all its weights).
  • JonBERTa-attn → JonBERTa-attn-ft-[0,1,2,3,4,5]L
    • e.g. JonBERTa-attn-ft-012L. All use a learning rate of 2e-05, but may differ in the attention layer(s) in which the telemetry features are introduced (any of layers 0, 1, 2, 3, 4, or 5).
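
The naming scheme above can be sketched as a pair of small helpers. This is only an illustration: the function names are hypothetical, the JonBERTa-head variants are left out because the card does not fully specify how their suffixes combine, and building a name this way does not imply that a checkpoint with that name was published.

```python
def codeberta_names(lr_mantissas=(1, 2, 5)):
    """Build CodeBERTa checkpoint names for learning rates of the form Ne-05."""
    return [f"CodeBERTa-ft-coco-{m}e-05lr" for m in lr_mantissas]


def jonberta_attn_name(layers):
    """Build a JonBERTa-attn name from the attention layers that receive telemetry."""
    unique = sorted(set(layers))
    if not unique or any(layer not in range(6) for layer in unique):
        raise ValueError("layers must be a non-empty subset of 0..5")
    digits = "".join(str(layer) for layer in unique)
    return f"JonBERTa-attn-ft-{digits}L"


print(codeberta_names())
print(jonberta_attn_name([0, 1, 2]))
```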

Other hyperparameters may be found in the paper or the replication package (see below).

Sources

To cite this work, please use:

@misc{de_moor_smart_invocation_2024,
    title = {A {Transformer}-{Based} {Approach} for {Smart} {Invocation} of {Automatic} {Code} {Completion}},
    url = {http://arxiv.org/abs/2405.14753},
    doi = {10.1145/3664646.3664760},
    author = {de Moor, Aral and van Deursen, Arie and Izadi, Maliheh},
    month = may,
    year = {2024},
}

Training Details

This model was trained with the following hyperparameters; all other settings are the TrainingArguments defaults. The dataset was prepared identically across all models, as detailed in the paper.

num_train_epochs : int = 6
learning_rate    : float = search([2e-5, 1e-5, 5e-5])
batch_size       : int = 16
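
A minimal sketch of how the learning-rate search above could be expanded into one configuration per run. Mapping `batch_size` to `per_device_train_batch_size` and using the model name as `output_dir` are assumptions made for illustration, not details taken from the paper, and `training_configs` is a hypothetical helper.

```python
# Values listed on this card; everything else is left at library defaults.
BASE_CONFIG = {
    "num_train_epochs": 6,
    "per_device_train_batch_size": 16,  # ASSUMED mapping of `batch_size`
}
LEARNING_RATES = [2e-5, 1e-5, 5e-5]


def training_configs():
    """Return one keyword-argument dict per learning rate in the search."""
    configs = []
    for lr in LEARNING_RATES:
        configs.append({
            **BASE_CONFIG,
            "learning_rate": lr,
            # ASSUMED: name the run after the card's naming scheme, e.g. 1e-05lr.
            "output_dir": f"CodeBERTa-ft-coco-{lr:.0e}lr",
        })
    return configs


for config in training_configs():
    print(config["output_dir"], config["learning_rate"])
```

Each dictionary uses only keyword names that `transformers.TrainingArguments` accepts, so it could be passed as `TrainingArguments(**config)` in an environment where that library is installed.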

Model size: 83.5M params (Safetensors, tensor type F32)