mvasiliniuc committed • Commit e5c691d • Parent(s): 266c594

Add initial model datacard.

README.md ADDED
---
datasets:
  - mvasiliniuc/iva-kotlin-codeint-clean-train
  - mvasiliniuc/iva-kotlin-codeint-clean-valid
language:
  - code
tags:
  - gpt2
  - code
  - kotlin
  - mobile
  - generation
widget:
  - text: "/**\n\t* A function that returns the version of the current operating system.\n*/\n"
    example_title: "Get current device operating system"
  - text: "/**\n\t* A function that returns the current TimeZone.\n*/\n"
    example_title: "Get current timezone"
  - text: "/**\n\t* A data class representing a Bank Account.\n*/\n"
    example_title: "Data Class - BankAccount"
---

iva-codeint-kotlin-small is a GPT-2 model (small version, 239.4M parameters) trained from scratch for the text-to-code task, tailored to the Kotlin language as used in native mobile development (Android).

## Usage

```Python
from transformers import pipeline

pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-kotlin-small")
outputs = pipe("fun printToConsole()")
```

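The widget examples in the card metadata prompt the model with a KDoc-style comment; the sketch below does the same through the pipeline API. The generation settings (`max_new_tokens`, `do_sample`, `temperature`) are illustrative choices for this example, not values recommended on this card.

```Python
from transformers import pipeline

pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-kotlin-small")

# Prompt with a KDoc comment, as in the widget examples, and sample a completion.
prompt = "/**\n\t* A function that returns the current TimeZone.\n*/\n"
outputs = pipe(
    prompt,
    max_new_tokens=64,       # length budget for the generated Kotlin snippet
    do_sample=True,          # sample rather than greedy decoding
    temperature=0.4,         # illustrative value
    num_return_sequences=1,
)
print(outputs[0]["generated_text"])
```
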
### Inference

```Python
import pprint
import requests

API_URL = "https://api-inference.huggingface.co/models/mvasiliniuc/iva-codeint-kotlin-small"
headers = {"Authorization": "Bearer <key>"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": """
/**
* A public function that returns the current version of the operating system.
*/
"""
})
pprint.pprint(output, compact=True)
```

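For a `text-generation` model, the Inference API normally answers with a list of `{"generated_text": ...}` objects; the short sketch below (an assumption about the response shape, continuing from the snippet above) pulls out the completed Kotlin source.

```Python
# Assumes the standard text-generation response shape: a list of
# {"generated_text": ...} dicts. Prints the first completion.
print(output[0]["generated_text"])
```
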
## Training

| Config | Value |
|-----------------------------|--------|
| seq length | 1024 |
| weight decay | 0.1 |
| learning rate | 0.0005 |
| max eval steps | -1 |
| shuffle buffer | 10000 |
| max train steps | 150000 |
| mixed precision | fp16 |
| num warmup steps | 2000 |
| train batch size | 5 |
| valid batch size | 5 |
| lr scheduler type | cosine |
| save checkpoint steps | 15000 |
| gradient checkpointing | false |
| gradient accumulation steps | 1 |

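The training script itself is not part of this card. As a rough illustration only (an assumption, not the author's code), the sketch below shows how the hyperparameters in the table could be wired into an AdamW optimizer and a cosine schedule with warmup via `transformers.get_scheduler`; the GPT-2 config here only sets the context length, and the tokenizer, vocabulary size, data loading, and training loop are omitted.

```Python
import torch
from transformers import GPT2Config, GPT2LMHeadModel, get_scheduler

# Hyperparameters mirrored from the table above.
config = {
    "seq_length": 1024,
    "weight_decay": 0.1,
    "learning_rate": 5e-4,
    "max_eval_steps": -1,
    "shuffle_buffer": 10_000,
    "max_train_steps": 150_000,
    "mixed_precision": "fp16",
    "num_warmup_steps": 2_000,
    "train_batch_size": 5,
    "valid_batch_size": 5,
    "lr_scheduler_type": "cosine",
    "save_checkpoint_steps": 15_000,
    "gradient_checkpointing": False,
    "gradient_accumulation_steps": 1,
}

# Fresh, untrained GPT-2 weights (training "from scratch"); vocabulary size and
# other architecture choices are left at library defaults for this sketch.
model = GPT2LMHeadModel(GPT2Config(n_positions=config["seq_length"]))

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=config["learning_rate"],
    weight_decay=config["weight_decay"],
)
lr_scheduler = get_scheduler(
    name=config["lr_scheduler_type"],
    optimizer=optimizer,
    num_warmup_steps=config["num_warmup_steps"],
    num_training_steps=config["max_train_steps"],
)
```
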
## Resources

Resources used for research:
* [Training a causal language model from scratch](https://huggingface.co/learn/nlp-course/chapter7/6)
* [CodeParrot, a GPT-2 model (1.5B parameters) trained to generate Python code](https://huggingface.co/codeparrot/codeparrot)