---
pipeline_tag: text-generation
license: apache-2.0
tags:
  - text generation
  - Deci AI
  - DeciCoder
programming_language:
  - Java
  - JavaScript
  - Python
metrics:
  - code_eval
inference: true
widget:
  - text: 'def print_hello_world():'
    example_title: Hello world
    group: Python
model-index:
  - name: DeciCoder-1b
    results:
      - task:
          type: text-generation
        dataset:
          type: nuprl/MultiPL-E
          name: MultiPL-HumanEval (Python)
        metrics:
          - name: pass@1
            type: pass@1
            value: 0.191
            verified: false
      - task:
          type: text-generation
        dataset:
          type: nuprl/MultiPL-E
          name: MultiPL-HumanEval (JavaScript)
        metrics:
          - name: pass@1
            type: pass@1
            value: 0.184
            verified: false
      - task:
          type: text-generation
        dataset:
          type: nuprl/MultiPL-E
          name: MultiPL-HumanEval (Java)
        metrics:
          - name: pass@1
            type: pass@1
            value: 0.166
            verified: false
datasets:
  - bigcode/starcoderdata
---

Model Card for DeciCoder 1B

DeciCoder 1B is a 1 billion parameter decoder-only code completion model trained on the Python, Java, and JavaScript subsets of the StarCoder Training Dataset. The model uses Grouped Query Attention and has a context window of 2048 tokens. It was trained using a Fill-in-the-Middle training objective. The model's architecture was generated by AutoNAC, Deci's proprietary Neural Architecture Search technology.

Model Details

  • Developed by: Deci
  • Model type: DeciCoder is an auto-regressive language model based on the transformer decoder architecture, using Grouped Query Attention.
  • Language(s): Python, Java, JavaScript
  • License: Model checkpoints are licensed under the Apache 2.0 license

Model Architecture

| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads | Hidden Size |
|------------|--------|-------|-----------------|-------------------------|-------------|
| 1.1B       | 20     | 32    | 2048            | 4                       | 2048        |
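These values can be checked against the released checkpoint by loading its configuration. Since DeciCoder uses a custom architecture, the authoritative field names are whatever its custom config class defines, so printing the full config is the safest way to inspect them (a minimal sketch, not part of the original card):

from transformers import AutoConfig

# DeciCoder ships a custom modeling/config implementation, so trust_remote_code is required.
config = AutoConfig.from_pretrained("Deci/DeciCoder-1b", trust_remote_code=True)

# Printing the config shows the actual field names (hidden size, number of heads,
# key/value heads, maximum sequence length, etc.) defined by the custom config class.
print(config)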

Uses

The model is intended for single- and multi-line code completion from a context window of up to 2048 tokens. It is not an instruction-tuned model, so prompts like "Write a function that computes the absolute value of an integer" won't yield the desired results. A more effective approach is to frame instructions in the style of source code comments (e.g. # this function calculates the absolute value of an integer) or to present a function signature and docstring, enabling the model to complete the function's body.
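As a purely illustrative sketch (the function name and comment below are arbitrary examples, not from the original card), the two prompt styles might look like this; either string can be passed to the generation code in the How to Use section below:

# Instruction-style prompt: a base completion model is unlikely to follow this well.
instruction_prompt = "Write a function that computes the absolute value of an integer."

# Comment-plus-signature prompt: gives the model source code context to complete.
completion_prompt = (
    "# this function calculates the absolute value of an integer\n"
    "def absolute_value(n: int) -> int:\n"
)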

How to Use

# pip install -q transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-1b"
device = "cuda" # for GPU usage or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))

Attribution

DeciCoder was trained on the StarCoder Training Dataset, filtered for Python, Java, and JavaScript code. For additional information, please refer to https://huggingface.co/datasets/bigcode/starcoderdata.

Limitations

The model was trained on source code in Python, Java, and JavaScript. While the primary natural language in the source is English, it does contain other languages. The model can therefore produce code snippets given some context, but there is no assurance that the resulting code will work as intended; it may be suboptimal, contain bugs, or even contain exploits.

Training Details

Training Data

DeciCoder was trained on the Python, Java, and JavaScript subsets of the StarCoder Training Dataset.

Training Procedure

  • Warm-Up Steps: 9000
  • Total Training Steps: 284k
  • Total Tokens: 446B
  • Global Batch Size: 768
  • Optimizer: AdamW
  • Optimizer Parameters: beta1=0.9, beta2=0.95
  • Weight Decay: 0.1
  • Learning Rate: 4e-4
  • Learning Rate Schedule: cosine
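The training code itself is not published in this card; purely as an illustration of how the hyperparameters above fit together, a standard AdamW-plus-cosine-warmup setup in PyTorch/Transformers would look roughly like this (the step counts and optimizer settings mirror the values listed; everything else is an assumption, not Deci's training code):

import torch
from transformers import AutoModelForCausalLM, get_cosine_schedule_with_warmup

# Sketch only: loads the released checkpoint and configures the listed optimizer settings.
model = AutoModelForCausalLM.from_pretrained("Deci/DeciCoder-1b", trust_remote_code=True)

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=4e-4,              # peak learning rate
    betas=(0.9, 0.95),    # beta1, beta2
    weight_decay=0.1,
)

# Cosine decay with 9,000 warm-up steps over 284,000 total training steps.
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=9_000,
    num_training_steps=284_000,
)

In an actual run, the global batch size of 768 would typically be reached through data parallelism and gradient accumulation.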

Evaluation

Below are DeciCoder's pass@1 scores on MultiPL-HumanEval:

| Python | JavaScript | Java  |
|--------|------------|-------|
| 19.1%  | 18.4%      | 16.6% |
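pass@k scores like these are commonly computed with the unbiased estimator from the HumanEval paper; the sketch below shows that estimator in general form (the sample counts are hypothetical, and this is not necessarily the exact evaluation harness used here):

from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n generated samples, c of them correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Average over problems; for k=1 this reduces to the fraction of correct samples.
samples = [(20, 4), (20, 0), (20, 7)]  # hypothetical (n, c) pairs per problem
print(sum(pass_at_k(n, c, k=1) for n, c in samples) / len(samples))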

Runtime Benchmarks

| Inference Tool / Hardware | A10 (tokens/sec) | A100 (tokens/sec) |
|---------------------------|------------------|-------------------|
| PyTorch                   | 1,364.2          | 3,244.4           |
| Infery LLM                | 3,889.3          | 11,676.8          |

  • Throughput (tokens/sec) - Measured with optimal batch size per hardware - A10 on BS 128, A100 on BS 512
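The benchmark harness behind these numbers is not described in this card; as a rough sketch only, end-to-end generation throughput could be measured along the following lines (batch size, prompt, and token counts are illustrative, and real benchmarks control for early stopping and prompt handling more carefully):

import time
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-1b"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True
).to("cuda")

batch_size, max_new_tokens = 128, 100          # illustrative values, not the settings above
prompts = ["def print_hello_world():"] * batch_size
inputs = tokenizer(prompts, return_tensors="pt").to("cuda")

torch.cuda.synchronize()
start = time.perf_counter()
model.generate(**inputs, max_new_tokens=max_new_tokens)
torch.cuda.synchronize()
elapsed = time.perf_counter() - start

# Counts only newly generated tokens and assumes generation runs the full max_new_tokens.
print(f"{batch_size * max_new_tokens / elapsed:.1f} tokens/sec")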

Documentation

How to Cite

Please cite this model using this format.

@misc{DeciFoundationModels,
  title = {DeciCoder},
  author = {DeciAI Research Team},
  year = {2023},
  url = {https://huggingface.co/deci/decicoder-1b},
}

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 25.6  |
| ARC (25-shot)       | 21.16 |
| HellaSwag (10-shot) | 31.09 |
| MMLU (5-shot)       | 24.34 |
| TruthfulQA (0-shot) | 47.05 |
| Winogrande (5-shot) | 50.83 |
| GSM8K (5-shot)      | 1.74  |
| DROP (3-shot)       | 2.98  |