Edit model card
YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

roberta-javascript


language: javascript datasets: - code_search_net

This is a roberta pre-trained version on the CodeSearchNet dataset for javascript Mask Language Model mission.

To load the model: (necessary packages: !pip install transformers sentencepiece)

from transformers import AutoTokenizer, AutoModelWithLMHead, pipeline
tokenizer = AutoTokenizer.from_pretrained("dbernsohn/roberta-javascript")
model = AutoModelWithLMHead.from_pretrained("dbernsohn/roberta-javascript")

fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=tokenizer
)

You can then use this model to fill masked words in a Java code.

code = """
var i;
for (i = 0; i < cars.<mask>; i++) {
  text += cars[i] + "<br>";
}
""".lstrip()

pred = {x["token_str"].replace("Ġ", ""): x["score"] for x in fill_mask(code)}
sorted(pred.items(), key=lambda kv: kv[1], reverse=True)
# [('length', 0.9959614872932434),
#  ('i', 0.00027875584783032537),
#  ('len', 0.0002283261710545048),
#  ('nodeType', 0.00013731322542298585),
#  ('index', 7.5289819505997e-05)]

The whole training process and hyperparameters are in my GitHub repo

Created by Dor Bernsohn

Downloads last month
13
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.