mvasiliniuc committed on
Commit
e5c691d
1 Parent(s): 266c594

Add initial model datacard.

Files changed (1):
  1. README.md +76 -0
README.md ADDED
---
datasets:
- mvasiliniuc/iva-kotlin-codeint-clean-train
- mvasiliniuc/iva-kotlin-codeint-clean-valid
language:
- code
tags:
- gpt2
- code
- kotlin
- mobile
- generation
widget:
- text: "/**\n\t* A function that returns the version of the current operating system.\n*/\n"
  example_title: "Get current device operating system"
- text: "/**\n\t* A function that returns the current TimeZone.\n*/\n"
  example_title: "Get current timezone"
- text: "/**\n\t* A data class representing a Bank Account.\n*/\n"
  example_title: "Data Class - BankAccount"
---

iva-codeint-kotlin-small is a GPT-2 model (small version, 239.4M parameters) trained from scratch for the text-to-code task, tailored to the Kotlin language as used in native mobile (Android) development.

## Usage

```python
from transformers import pipeline

# Load the text-generation pipeline backed by the Kotlin model.
pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-kotlin-small")
outputs = pipe("fun printToConsole()")
```
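
You can also prompt the model with a Kotlin doc comment, as in the widget examples above. A minimal sketch (the sampling parameters below are illustrative, not values documented for this model):

```python
from transformers import pipeline

pipe = pipeline("text-generation", model="mvasiliniuc/iva-codeint-kotlin-small")

# Prompt taken from the widget examples in the metadata above.
prompt = "/**\n\t* A function that returns the current TimeZone.\n*/\n"

# Illustrative settings; adjust max_new_tokens/temperature for your use case.
outputs = pipe(prompt, max_new_tokens=64, do_sample=True, temperature=0.2)
print(outputs[0]["generated_text"])
```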

### Inference
```python
import requests
import pprint

API_URL = "https://api-inference.huggingface.co/models/mvasiliniuc/iva-codeint-kotlin-small"
headers = {"Authorization": "Bearer <key>"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": """
/**
* A public function that returns the current version of the operating system.
*/
"""
})
pprint.pprint(output, compact=True)
```
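
For a text-generation model, the Inference API typically responds with a list of objects carrying a `generated_text` field (or an error object, e.g. while the model is still loading). A small sketch that assumes this response shape:

```python
# Assumes the usual text-generation response shape: [{"generated_text": "..."}].
if isinstance(output, list) and output and "generated_text" in output[0]:
    print(output[0]["generated_text"])
else:
    # Fallback: print whatever came back, e.g. {"error": "..."}.
    print(output)
```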

## Training

| Config | Value |
|------|------------------|
| seq length | 1024 |
| weight decay | 0.1 |
| learning rate | 0.0005 |
| max eval steps | -1 |
| shuffle buffer | 10000 |
| max train steps | 150000 |
| mixed precision | fp16 |
| num warmup steps | 2000 |
| train batch size | 5 |
| valid batch size | 5 |
| lr scheduler type | cosine |
| save checkpoint steps | 15000 |
| gradient checkpointing | false |
| gradient accumulation steps | 1 |
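
A minimal sketch of how the optimizer and learning-rate schedule implied by this table could be set up with PyTorch and `transformers` (illustrative only, not the authors' training script; the GPT-2 configuration below is an assumption sized to the 1024-token sequence length):

```python
import torch
from transformers import AutoConfig, GPT2LMHeadModel, get_cosine_schedule_with_warmup

# Hyperparameters taken from the table above.
LEARNING_RATE = 5e-4
WEIGHT_DECAY = 0.1
NUM_WARMUP_STEPS = 2_000
MAX_TRAIN_STEPS = 150_000

# Assumption: a GPT-2 architecture with a 1024-token context, initialized from scratch.
config = AutoConfig.from_pretrained("gpt2", n_positions=1024)
model = GPT2LMHeadModel(config)

optimizer = torch.optim.AdamW(
    model.parameters(), lr=LEARNING_RATE, weight_decay=WEIGHT_DECAY
)
lr_scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=NUM_WARMUP_STEPS,
    num_training_steps=MAX_TRAIN_STEPS,
)
# fp16, batch size 5, and gradient accumulation 1 would be handled by the training
# loop itself (e.g. via accelerate), which is omitted here.
```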

## Resources

Resources used for research:
* [Training a causal language model from scratch](https://huggingface.co/learn/nlp-course/chapter7/6)
* [CodeParrot, a GPT-2 model (1.5B parameters) trained to generate Python code](https://huggingface.co/codeparrot/codeparrot)