---
language:
  - java
  - code
license: apache-2.0
---
## JavaBERT
A BERT-like model pretrained on Java software code.

### Training Data
The model was trained on 2,998,345 Java files retrieved from open-source projects on GitHub. A `bert-base-cased` tokenizer was used.

### Training Objective
This model was trained with a masked language modeling (MLM) objective: tokens in the input are replaced with a `[MASK]` token, and the model learns to predict the original tokens from the surrounding context.
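The masking step can be sketched as follows. This is a minimal illustration, not the exact JavaBERT preprocessing pipeline; the 15% masking rate follows the original BERT recipe and is an assumption here.

```python
import random

MASK_TOKEN = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """Randomly replace a fraction of tokens with [MASK].

    Returns the masked sequence and the original tokens, which serve
    as the prediction targets during MLM training. The seed is fixed
    only to make this illustration reproducible.
    """
    rng = random.Random(seed)
    labels = list(tokens)  # targets the model must recover
    masked = [MASK_TOKEN if rng.random() < mask_prob else tok for tok in tokens]
    return masked, labels

# Whitespace tokenization is a simplification; JavaBERT uses a
# bert-base-cased subword tokenizer.
tokens = "public static void main ( String [ ] args )".split()
masked, labels = mask_tokens(tokens)
```

During training, the loss is computed only on the masked positions, comparing the model's predictions against `labels`.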

### Usage
```python
from transformers import pipeline

pipe = pipeline('fill-mask', model='CAUKiel/JavaBERT')

# Use '[MASK]' to mask the tokens/words the model should predict;
# the snippet below is an illustrative example of Java input.
code = 'public [MASK] void main(String[] args) { }'
output = pipe(code)
```