---
language:
- code
license: apache-2.0
widget:
- text: public [MASK] isOdd(Integer num) {if (num % 2 == 0) {return "even";} else
    {return "odd";}}
---

# Model Card for JavaBERT
 
A BERT-like model pretrained on Java software code.
 
 
 
 
 
 
# Model Details
 
## Model Description
 
A BERT-like model pretrained on Java software code.
 
- **Developed by:** Christian-Albrechts-University of Kiel (CAUKiel)
- **Shared by [Optional]:** Hugging Face
- **Model type:** Fill-Mask
- **Language(s) (NLP):** en
- **License:** Apache-2.0
- **Related Models:** A version of this model using an uncased tokenizer is available at [CAUKiel/JavaBERT-uncased](https://huggingface.co/CAUKiel/JavaBERT-uncased).
  - **Parent Model:** BERT
- **Resources for more information:** 
  - [Associated Paper](https://arxiv.org/pdf/2110.10404.pdf)
 
 
# Uses
 
## Direct Use
 
Fill-Mask
 
## Downstream Use [Optional]
 
More information needed.
 
## Out-of-Scope Use
 
The model should not be used to intentionally create hostile or alienating environments for people. 
 
# Bias, Risks, and Limitations
 
Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.
 
 
## Recommendations
 
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
 
# Training Details
 
## Training Data
The model was trained on 2,998,345 Java files retrieved from open-source projects on GitHub. The model uses a `bert-base-cased` tokenizer.
 
## Training Procedure
 
 
### Training Objective
The model was trained with a masked language modeling (MLM) objective.
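
As a rough illustration of what an MLM objective does to its input, the sketch below corrupts a token sequence and records which positions the model would have to predict. This card does not state the masking rate or replacement scheme used for JavaBERT; the 15% selection rate and the 80/10/10 split are taken from the original BERT recipe and are assumptions here, and `mask_tokens` is a hypothetical helper, not part of any library.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (corrupted, labels); labels[i] is None where no prediction is required."""
    rng = random.Random(seed)
    corrupted, labels = [], []
    for tok in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token and skip it in the loss.
            corrupted.append(tok)
            labels.append(None)
            continue
        labels.append(tok)  # the model must recover the original token here
        roll = rng.random()
        if roll < 0.8:
            corrupted.append(MASK)               # assumed 80%: replace with [MASK]
        elif roll < 0.9:
            corrupted.append(rng.choice(vocab))  # assumed 10%: replace with a random token
        else:
            corrupted.append(tok)                # assumed 10%: leave the token unchanged
    return corrupted, labels

tokens = ["public", "String", "isOdd", "(", "Integer", "num", ")"]
corrupted, labels = mask_tokens(tokens, vocab=tokens, mask_prob=0.15, seed=0)
```

In actual pretraining the same idea is applied to token IDs from the tokenizer rather than to a hand-written word list.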
 
### Preprocessing
 
More information needed.
 
 
### Speeds, Sizes, Times
 
More information needed.
 
# Evaluation
 
 
 
## Testing Data, Factors & Metrics
 
### Testing Data
More information needed.
 
 
### Factors
 
More information needed.
 
### Metrics
 
More information needed.
 
 
## Results 
More information needed.
 
 
# Model Examination
 
More information needed.
 
# Environmental Impact
 
 
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
 
- **Hardware Type:** More information needed.
- **Hours used:** More information needed.
- **Cloud Provider:** More information needed.
- **Compute Region:** More information needed.
- **Carbon Emitted:** More information needed.
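
Once the fields above are known, a first-order estimate can be computed as energy consumed times the carbon intensity of the local grid, which is the basic relation behind calculators of this kind. The sketch below is a minimal version under that assumption; `estimate_co2_kg` is a hypothetical helper, and the input values are placeholders for illustration, not figures for JavaBERT.

```python
def estimate_co2_kg(power_watts, hours, carbon_intensity_kg_per_kwh, num_devices=1):
    """Estimate emissions as energy used (kWh) times grid carbon intensity (kg CO2eq/kWh)."""
    energy_kwh = power_watts * num_devices * hours / 1000.0
    return energy_kwh * carbon_intensity_kg_per_kwh

# Placeholder inputs only; the real hardware, hours, and region are not documented here.
example = estimate_co2_kg(power_watts=250, hours=100, carbon_intensity_kg_per_kwh=0.4)
```

This ignores data-center overhead and any carbon offsets, so it should be read as a lower-bound style approximation rather than a reported value.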
 
# Technical Specifications [optional]
 
## Model Architecture and Objective
 
More information needed.
 
## Compute Infrastructure
 
More information needed.
 
### Hardware
 
More information needed.
 
### Software
 
More information needed.
 
# Citation
 
 
 
**BibTeX:**

```
@inproceedings{De_Sousa_Hasselbring_2021,
  address={Melbourne, Australia},
  title={JavaBERT: Training a Transformer-Based Model for the Java Programming Language},
  rights={https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html},
  ISBN={9781665435833},
  url={https://ieeexplore.ieee.org/document/9680322/},
  DOI={10.1109/ASEW52652.2021.00028},
  booktitle={2021 36th IEEE/ACM International Conference on Automated Software Engineering Workshops (ASEW)},
  publisher={IEEE},
  author={Tavares de Sousa, Nelson and Hasselbring, Wilhelm},
  year={2021},
  month=nov,
  pages={90--95}
}
```
 
**APA:**
 
More information needed.
 
# Glossary [optional]
More information needed.
 
# More Information [optional]
 
More information needed.
 
# Model Card Authors [optional]
 
Christian-Albrechts-University of Kiel (CAUKiel), in collaboration with Ezi Ozoani and the team at Hugging Face.
 
# Model Card Contact
 
More information needed.
 
# How to Get Started with the Model
 
Use the code below to get started with the model.
 
<details>
<summary> Click to expand </summary>

 ```python
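from transformers import pipeline

pipe = pipeline('fill-mask', model='CAUKiel/JavaBERT')
# Replace with your own Java code; use '[MASK]' to mask a token in the code.
CODE = 'public [MASK] isOdd(Integer num) {if (num % 2 == 0) {return "even";} else {return "odd";}}'
output = pipe(CODE)  # list of candidate fills, each a dict with a score and the predicted token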
```
 
</details>
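
To prepare a masked input from existing Java code, a small helper along these lines may be convenient. This is a sketch only: `mask_identifier` is a hypothetical function, not part of `transformers`, and it assumes the target is a plain identifier or keyword that can be matched on word boundaries.

```python
import re

def mask_identifier(java_code, identifier, mask_token="[MASK]"):
    """Replace the first whole-word occurrence of `identifier` with the mask token."""
    pattern = r"\b" + re.escape(identifier) + r"\b"
    # A function replacement avoids any backslash processing of mask_token.
    masked, count = re.subn(pattern, lambda _: mask_token, java_code, count=1)
    if count == 0:
        raise ValueError(f"{identifier!r} not found in the given code")
    return masked

snippet = 'public String isOdd(Integer num) {return num % 2 == 0 ? "even" : "odd";}'
masked = mask_identifier(snippet, "String")
```

The resulting string can be passed to the fill-mask pipeline. Note that `[MASK]` stands for a single tokenizer token, so a long identifier that the tokenizer splits into several word pieces may not be recoverable from one mask.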