dbernsohn committed on
Commit
bd56a33
1 Parent(s): e00ef17

Create README.md

  1. README.md +61 -0
README.md ADDED
@@ -0,0 +1,61 @@
+ # roberta-go
+ ---
+ language: Go
+ datasets:
+ - CodeSearchNet
+ ---
+
+ This is a [RoBERTa](https://arxiv.org/pdf/1907.11692.pdf) model pre-trained on the [CodeSearchNet dataset](https://github.com/github/CodeSearchNet) for masked language modeling of **Golang** code.
+
+ To load the model:
+ (required packages: `!pip install transformers sentencepiece`)
+ ```python
+ from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline
+ tokenizer = AutoTokenizer.from_pretrained("dbernsohn/roberta-go")
+ model = AutoModelForMaskedLM.from_pretrained("dbernsohn/roberta-go")  # replaces the deprecated AutoModelWithLMHead
+
+ fill_mask = pipeline(
+     "fill-mask",
+     model=model,
+     tokenizer=tokenizer
+ )
+ ```
+
+ You can then use this model to fill in masked tokens in Go code.
+
+ ```python
+ code = """
+ package main
+
+ import (
+ 	"fmt"
+ 	"runtime"
+ )
+
+ func main() {
+ 	fmt.Print("Go runs on ")
+ 	switch os := runtime.<mask>; os {
+ 	case "darwin":
+ 		fmt.Println("OS X.")
+ 	case "linux":
+ 		fmt.Println("Linux.")
+ 	default:
+ 		// freebsd, openbsd,
+ 		// plan9, windows...
+ 		fmt.Printf("%s.\n", os)
+ 	}
+ }
+ """.lstrip()
+
+ pred = {x["token_str"].replace("Ġ", ""): x["score"] for x in fill_mask(code)}
+ sorted(pred.items(), key=lambda kv: kv[1], reverse=True)
+ # [('GOOS', 0.11810332536697388),
+ #  ('FileInfo', 0.04276798665523529),
+ #  ('Stdout', 0.03572738170623779),
+ #  ('Getenv', 0.025064032524824142),
+ #  ('FileMode', 0.01462600938975811)]
+ ```
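The post-processing step in the example above (strip the tokenizer's space marker, then rank by score) can be tried without downloading the model. The following is a minimal sketch: `rank_predictions` and `sample_results` are hypothetical names introduced here, and the sample list is a hand-written stand-in for `fill_mask(code)` that reuses three of the scores reported above. Depending on the installed `transformers` version, `token_str` may carry a `Ġ` marker or a plain leading space, so the sketch removes both.

```python
from typing import Dict, List, Tuple


def rank_predictions(results: List[Dict]) -> List[Tuple[str, float]]:
    """Turn fill-mask style results into (token, score) pairs, best first."""
    # "Ġ" is the byte-level BPE marker for a leading space; drop it for display.
    cleaned = {r["token_str"].replace("Ġ", "").strip(): r["score"] for r in results}
    return sorted(cleaned.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical stand-in for fill_mask(code): same keys as the pipeline output,
# scores copied from the example above and deliberately shuffled.
sample_results = [
    {"token_str": "ĠStdout", "score": 0.03572738170623779},
    {"token_str": "GOOS", "score": 0.11810332536697388},
    {"token_str": " FileInfo", "score": 0.04276798665523529},
]

ranked = rank_predictions(sample_results)
print(ranked[0][0])  # GOOS
```

With the real pipeline, `rank_predictions(fill_mask(code))` should give the same kind of ranked list as the dictionary comprehension in the README example.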
+
+ The full training process and hyperparameters are documented in my [GitHub repo](https://github.com/DorBernsohn/CodeLM/tree/main/CodeMLM).
+
+ > Created by [Dor Bernsohn](https://www.linkedin.com/in/dor-bernsohn-70b2b1146/)