MultiPLCoder-15b

This is the 15 billion parameter version of MultiPLCoder, a set of StarCoder-based models fine-tuned on the MultiPL-T dataset. These models are state-of-the-art on low-resource languages such as Lua, Racket, and OCaml.

This 15 billion parameter model is the most capable of the MultiPLCoder family. However, it requires a dedicated GPU for inference. For a smaller model that can run on a CPU, see MultiPLCoder-1b.
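
As a rough illustration of why a dedicated GPU is needed, the sketch below estimates the memory taken by the weights alone. It assumes about 15.5 billion parameters stored in BF16 (2 bytes each); the function name `weight_memory_gib` is hypothetical, and the estimate ignores activations and other runtime overhead.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Estimate the memory needed for the model weights alone, in GiB.

    Assumption: every parameter is stored with the same width
    (2 bytes for BF16). Activations and caches are not counted.
    """
    return num_params * bytes_per_param / 2**30


# Assumed parameter count of roughly 15.5 billion.
print(f"{weight_memory_gib(15.5e9):.1f} GiB")  # prints "28.9 GiB"
```

Actual memory use during generation will be higher than this figure, so treat it as a lower bound.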

Language Revision Index

The table below lists the best-performing model revision for each language.

Language Revision ID Epoch
Lua 6069aa54dd554404dd18fccdf5dedd56b8088e74 4
Racket f0c77c06482f436f469007f20d731cb9dd73d609 8
OCaml e7babda985786810707200ff885df6105de7dc56 4
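
One way to avoid copying revision IDs by hand is to keep the table in a small lookup. The sketch below is only illustrative: `REVISIONS`, `revision_for`, and `load_model` are hypothetical helper names, and the IDs are copied from the table above. The `transformers` import is deferred so that the lookup itself can be used without loading any model.

```python
# Revision IDs copied from the table above (keys are lowercase language names).
REVISIONS = {
    "lua": "6069aa54dd554404dd18fccdf5dedd56b8088e74",
    "racket": "f0c77c06482f436f469007f20d731cb9dd73d609",
    "ocaml": "e7babda985786810707200ff885df6105de7dc56",
}


def revision_for(language: str) -> str:
    """Return the commit revision for a language, ignoring case and padding."""
    key = language.strip().lower()
    if key not in REVISIONS:
        raise ValueError(f"no revision listed for language: {language!r}")
    return REVISIONS[key]


def load_model(language: str, repo_id: str = "nuprl/MultiPLCoder-15b"):
    """Load the tokenizer and the model pinned to the language's revision."""
    # Imported lazily so the lookup above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, revision=revision_for(language)
    )
    return tokenizer, model
```

Calling `load_model("racket")` would then download the Racket revision; moving the model to a GPU is left to the caller.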

Usage

To use one of the models in this repository, first select the commit revision for that model from the table above. For example, to use the Lua model:

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer, then the model weights pinned to the Lua revision.
tokenizer = AutoTokenizer.from_pretrained("nuprl/MultiPLCoder-15b")
lua_revision = "6069aa54dd554404dd18fccdf5dedd56b8088e74"
model = AutoModelForCausalLM.from_pretrained("nuprl/MultiPLCoder-15b", revision=lua_revision).cuda()

Note that the model's default configuration does not enable caching, so you must pass use_cache=True when generating.

toks = tokenizer.encode("-- Fibonacci iterative", return_tensors="pt").cuda()
out = model.generate(toks, use_cache=True, do_sample=True, temperature=0.2, top_p=0.95, max_length=256)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Example output (sampling is enabled, so results may vary):

-- Fibonacci iterative.
local function fib_iterative(n)
    if n == 0 or n == 1 then
        return n
    end
    local previous, current = 0, 1
    for _ = 2, n do
        previous, current = current, current + previous
    end
    return current
end
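
The generation call above can be wrapped so that the cache setting is never forgotten. This is a minimal sketch, not part of the model's API: `GENERATION_KWARGS`, `comment_prompt`, and `complete` are hypothetical names, the sampling values are the ones used in the example above, and prompting Racket and OCaml with a comment in their own syntax is an assumption that mirrors the Lua example.

```python
# Generation settings from the example above; use_cache must be set explicitly.
GENERATION_KWARGS = {
    "use_cache": True,
    "do_sample": True,
    "temperature": 0.2,
    "top_p": 0.95,
    "max_length": 256,
}


def comment_prompt(language: str, text: str) -> str:
    """Wrap a description in the comment syntax of the target language."""
    key = language.strip().lower()
    if key == "lua":
        return f"-- {text}"
    if key == "racket":
        return f";; {text}"
    if key == "ocaml":
        return f"(* {text} *)"
    raise ValueError(f"unsupported language: {language!r}")


def complete(model, tokenizer, prompt: str, **overrides) -> str:
    """Generate a completion; keyword overrides replace the defaults above."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    # Place the prompt tokens on the same device as the model.
    toks = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
    out = model.generate(toks, **kwargs)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

With a model and tokenizer loaded as shown earlier, `complete(model, tokenizer, comment_prompt("lua", "Fibonacci iterative"))` issues the same request as the example above.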