Update README.md
README.md
CHANGED
@@ -19,10 +19,10 @@ tags:

# Kexer models

-Kexer models
-This is a repository for fine-tuned CodeLlama-7b model in the Hugging Face Transformers format.

-#

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -59,7 +59,7 @@ As with the base model, we can use FIM. To do this, the following format must be

# Training setup

-The model was trained on one A100 GPU with following hyperparameters:

| **Hyperparameter** | **Value** |
|:---------------------------:|:----------------------------------------:|
@@ -69,24 +69,23 @@ The model was trained on one A100 GPU with following hyperparameters:
| `total_batch_size` | 256 (~130K tokens per step) |
| `num_epochs` | 4 |

-More details about

# Fine-tuning data

-For this model we used 15K exmaples
-For more information about the dataset follow the link.

# Evaluation

-

-

| **Model name** | **Kotlin HumanEval Pass Rate** |
|:---------------------------:|:----------------------------------------:|
-| `
-| `

-# Ethical

-CodeLlama-7B-Kexer

# Kexer models

+Kexer models are a collection of open-source generative text models fine-tuned on the [Kotlin Exercices](https://huggingface.co/datasets/JetBrains/KExercises) dataset.
+This is a repository for the fine-tuned **CodeLlama-7b** model in the *Hugging Face Transformers* format.

+# How to use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
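# NOTE: the rest of this snippet is elided from the diff. The lines below are
# a hedged sketch of typical Transformers usage, not the card's exact code;
# the repo id is an assumption based on the model and organization names.
import torch

checkpoint = "JetBrains/CodeLlama-7B-Kexer"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)

# Generate a Kotlin completion for a simple prompt.
prompt = "fun fibonacci(n: Int): Int {"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

A hunk header earlier in this diff mentions that, as with the base model, FIM (fill-in-the-middle) can be used, but the required prompt format itself is elided here. As a hedged sketch only, CodeLlama's standard infilling scheme with `<PRE>`, `<SUF>`, and `<MID>` tokens would look as follows; the fine-tuned model may expect a different format, so consult the full model card.

```python
# Hedged sketch of CodeLlama-style fill-in-the-middle prompting; the exact
# format CodeLlama-7B-Kexer expects is not shown in this diff.
prefix = "fun sum(numbers: List<Int>): Int {\n    "
suffix = "\n}"
fim_prompt = f"<PRE> {prefix} <SUF>{suffix} <MID>"

inputs = tokenizer(fim_prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```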

# Training setup

+The model was trained on one A100 GPU with the following hyperparameters:

| **Hyperparameter** | **Value** |
|:---------------------------:|:----------------------------------------:|
| `total_batch_size` | 256 (~130K tokens per step) |
| `num_epochs` | 4 |

+More details about fine-tuning can be found in the technical report.
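Read literally, 256 sequences per optimizer step at roughly 130K tokens per step implies a fine-tuning context of about 512 tokens; that is an inference from the table, not something the card states. Below is a minimal sketch of arguments consistent with the two rows shown, assuming the standard `transformers` Trainer API; every other value is a placeholder, since the remaining table rows are elided from this diff.

```python
from transformers import TrainingArguments

# Sketch only: total_batch_size = 256 and num_epochs = 4 come from the table
# above; the per-device/accumulation split and precision are assumptions.
args = TrainingArguments(
    output_dir="codellama-7b-kexer",
    per_device_train_batch_size=8,   # assumption for a single A100
    gradient_accumulation_steps=32,  # 8 * 32 = 256 effective batch size
    num_train_epochs=4,              # from the table
    bf16=True,                       # assumption
)
```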

# Fine-tuning data

+For tuning this model, we used 15K examples from the synthetically generated [Kotlin Exercices dataset](https://huggingface.co/datasets/JetBrains/KExercises). Every example follows the HumanEval format. In total, the dataset contains about 3.5M tokens.
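As an illustration of the data described above, here is a hedged sketch of inspecting the dataset with the `datasets` library; the split name is an assumption, and the record layout is whatever the dataset actually provides.

```python
from datasets import load_dataset

# Load the KExercises dataset referenced above ("train" split is an assumption).
kexercises = load_dataset("JetBrains/KExercises", split="train")
print(len(kexercises))  # expected to be on the order of 15K examples
print(kexercises[0])    # one HumanEval-style Kotlin exercise
```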

# Evaluation

+For evaluation, we used the [Kotlin HumanEval](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval) dataset, which contains all 161 tasks from HumanEval translated into Kotlin by human experts. You can find more details about the pre-processing necessary to obtain our results, including the code for running, on the [dataset's page](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval).

+Here are the results of our evaluation:

| **Model name** | **Kotlin HumanEval Pass Rate** |
|:---------------------------:|:----------------------------------------:|
+| `CodeLlama-7B` | 26.89 |
+| `CodeLlama-7B-Kexer` | **42.24** |

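For scale, a 42.24% pass rate on the 161-task benchmark corresponds to roughly 68 solved tasks, versus about 43 for the base model; the per-task counts below are derived from the reported percentages and are not reported by the card.

```python
# Convert the reported Kotlin HumanEval pass rates into approximate task counts.
total_tasks = 161
for name, rate in [("CodeLlama-7B", 26.89), ("CodeLlama-7B-Kexer", 42.24)]:
    solved = round(total_tasks * rate / 100)
    print(f"{name}: {rate}% ~= {solved}/{total_tasks} tasks")
```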
+# Ethical considerations and limitations

+CodeLlama-7B-Kexer is a new technology that carries risks with use. The testing conducted to date has not covered, nor could it cover, all scenarios. For these reasons, as with all LLMs, CodeLlama-7B-Kexer's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate or objectionable responses to user prompts. The model was fine-tuned on a specific data format (Kotlin tasks), and deviation from this format can also lead to inaccurate or undesirable responses to user queries. Therefore, before deploying any applications of CodeLlama-7B-Kexer, developers should perform safety testing and tuning tailored to their specific applications of the model.