codebert-java

This is a microsoft/codebert-base-mlm model, further trained for 1,000,000 steps (batch_size=32) on the masked-language-modeling task, using Java code from the codeparrot/github-code-clean dataset.
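Because the model is trained with masked language modeling, one quick way to try it is a fill-mask query. The following is a minimal sketch, not an official usage recipe. It assumes the checkpoint is published on the Hugging Face Hub under the id `neulab/codebert-java` (this card does not state the id) and that the tokenizer uses the RoBERTa-style `<mask>` token. `mask_first` and `top_predictions` are hypothetical helper names introduced here for illustration.

```python
def mask_first(code: str, target: str, mask_token: str = "<mask>") -> str:
    """Replace the first occurrence of `target` in `code` with the mask token."""
    if target not in code:
        raise ValueError(f"{target!r} does not occur in the given code")
    return code.replace(target, mask_token, 1)


def top_predictions(masked_code: str,
                    model_name: str = "neulab/codebert-java",  # assumed Hub id
                    k: int = 5):
    """Return the top-k (token, score) guesses for a single masked position."""
    # Imported lazily so that mask_first can be used without transformers installed.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=model_name)
    # With exactly one mask in the input, the pipeline returns a list of dicts.
    return [(p["token_str"], p["score"]) for p in fill(masked_code, top_k=k)]


if __name__ == "__main__":
    masked = mask_first("int sum = a + b;", "+")
    # Downloads the model on first use.
    for token, score in top_predictions(masked):
        print(f"{token!r}\t{score:.3f}")
```

The input should contain exactly one mask token; with several masks the pipeline returns a nested list and the list comprehension above would need adjusting.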

It is intended to be used in CodeBERTScore, but can also be used with any other model or for any other task.
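For the intended CodeBERTScore use, a rough sketch follows. It assumes the `code_bert_score` Python package is installed and that its `score` function accepts `cands`, `refs`, and `lang` and returns a 4-tuple of (precision, recall, F1, F3) tensors, as described in that project's README. The wrapper name `score_java` is hypothetical, and whether the package selects this exact checkpoint for `lang="java"` is not confirmed by this card.

```python
def score_java(candidates, references):
    """Score generated Java snippets against references with CodeBERTScore.

    Returns a dict of per-pair score lists. Sketch only; see the assumptions above.
    """
    if len(candidates) != len(references):
        raise ValueError("candidates and references must have the same length")

    # Imported lazily: the package downloads a model the first time it is used.
    import code_bert_score

    precision, recall, f1, f3 = code_bert_score.score(
        cands=candidates, refs=references, lang="java"
    )
    return {
        "precision": precision.tolist(),
        "recall": recall.tolist(),
        "f1": f1.tolist(),
        "f3": f3.tolist(),
    }


if __name__ == "__main__":
    cands = ["int add(int a, int b) { return a + b; }"]
    refs = ["int sum(int x, int y) { return x + y; }"]
    print(score_java(cands, refs))
```

Each list holds one score per candidate-reference pair, in input order.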

For more information, see:


If you use this model for research, please cite:

  url = {},
  author = {Zhou, Shuyan and Alon, Uri and Agarwal, Sumit and Neubig, Graham},
  title = {CodeBERTScore: Evaluating Code Generation with Pretrained Models of Code},  
  publisher = {arXiv},
  year = {2023},