---
license: apache-2.0
---

# Kexer models

Kexer models are a collection of open-source generative text models fine-tuned on the Kotlin Exercises dataset.
This repository contains the fine-tuned CodeLlama-7b model in the Hugging Face Transformers format.

# Model use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load pre-trained model and tokenizer
model_name = 'JetBrains/CodeLlama-7B-Kexer'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to('cuda')

# Create and encode input
input_text = """\
This function takes an integer n and returns factorial of a number:
fun factorial(n: Int): Int {\
"""
input_ids = tokenizer.encode(
    input_text, return_tensors='pt'
).to('cuda')

# Generate
output = model.generate(
    input_ids, max_length=150, num_return_sequences=1,
    no_repeat_ngram_size=2, early_stopping=True
)

# Decode output
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(generated_text)
```

# Training setup

The model was trained on one A100 GPU with the following hyperparameters:

| Hyperparameter | Value |
|:---------------------------:|:----------------------------------------:|
| `warmup` | 10% |
| `max_lr` | 1e-4 |
| `scheduler` | linear |
| `total_batch_size` | 256 (~130K tokens per step) |
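
The hyperparameters above can be read as a learning-rate schedule. The sketch below is an illustration only, not the training code: it assumes that `warmup` means the first 10% of optimizer steps, that the `linear` scheduler decays linearly to zero after warmup (the shape produced by `get_linear_schedule_with_warmup` in Transformers), and that `total_batch_size` counts sequences. The function name and `TOTAL_STEPS` are hypothetical.

```python
def linear_warmup_decay_lr(step, total_steps, max_lr=1e-4, warmup_fraction=0.10):
    """Learning rate at `step`: linear warmup to max_lr, then linear decay to zero.

    Assumption: this mirrors the table's `warmup`, `max_lr`, and `scheduler` rows.
    """
    warmup_steps = max(1, round(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to max_lr
        return max_lr * (step / warmup_steps)
    # Decay phase: ramp from max_lr down to 0 at the final step
    remaining = max(0, total_steps - step)
    return max_lr * (remaining / max(1, total_steps - warmup_steps))


TOTAL_STEPS = 100  # hypothetical; the number of training steps is not stated

for step in (0, 5, 10, 55, 100):
    print(step, linear_warmup_decay_lr(step, TOTAL_STEPS))

# If the batch holds 256 sequences and ~130K tokens, the average sequence
# length is roughly 508 tokens (assumption: the batch size counts sequences).
avg_tokens_per_sequence = 130_000 / 256
print(round(avg_tokens_per_sequence))
```

Under these assumptions the rate peaks at `max_lr` when warmup ends and reaches zero at the final step.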


# Fine-tuning data

For this model, we used 15K examples from the Kotlin Exercises dataset {TODO: link!}. For more information about the dataset, follow the link.

# Evaluation

For evaluation, we used Kotlin HumanEval (more information here).

Results for the base and fine-tuned models:

| Model name | Kotlin HumanEval Pass Rate | Kotlin Completion |
|:---------------------------:|:----------------------------------------:|:----------------------------------------:|
| `base model` | 26.89 | 0.388 |
| `fine-tuned model` | 42.24 | 0.344 |
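
To make the table easier to compare, the following sketch computes the absolute and relative change of each metric from the base model to the fine-tuned model. It uses only the numbers in the table. The helper name `change` is hypothetical, and the sketch makes no assumption about whether a higher Kotlin Completion score is better.

```python
# (base, fine-tuned) values copied from the evaluation table
results = {
    "Kotlin HumanEval Pass Rate": (26.89, 42.24),
    "Kotlin Completion": (0.388, 0.344),
}


def change(base, tuned):
    """Return (absolute change, relative change in percent) from base to tuned."""
    absolute = tuned - base
    relative_pct = 100.0 * absolute / base
    return absolute, relative_pct


for metric, (base, tuned) in results.items():
    absolute, relative_pct = change(base, tuned)
    print(f"{metric}: {absolute:+.3f} absolute, {relative_pct:+.1f}% relative")
```

The pass rate rises by about 15.35 points (roughly 57% relative to the base model), while the Kotlin Completion score falls by 0.044 (roughly 11% relative).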