Update README.md

by jdev8 - opened May 15

base: refs/heads/main ← from: refs/pr/3


+63

-187

Files changed (4)

.gitattributes +35 -0
LICENSE +0 -125
NOTICE +0 -1
README.md +28 -61

.gitattributes ADDED

	@@ -0,0 +1,35 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
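
To see which files the patterns above would route through Git LFS, here is a minimal sketch. It only approximates gitattributes matching by testing each file's basename with Python's `fnmatch`; the helper name `is_lfs_tracked` and the shortened pattern list are illustrative assumptions, not part of this PR, and real gitattributes rules (for example `saved_model/**/*`) are not fully covered.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# A few of the patterns from the .gitattributes hunk above (not the full list).
LFS_PATTERNS = ["*.bin", "*.gz", "*.safetensors", "*.tar.*", "*tfevents*"]


def is_lfs_tracked(path: str) -> bool:
    """Approximate check: match the file's basename against each pattern."""
    name = PurePosixPath(path).name
    return any(fnmatchcase(name, pattern) for pattern in LFS_PATTERNS)


for candidate in ["model.safetensors", "README.md", "runs/events.out.tfevents.1"]:
    print(candidate, is_lfs_tracked(candidate))
```

For an authoritative answer in a real checkout, `git check-attr filter -- <path>` reports the attribute Git actually applies.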

LICENSE DELETED

@@ -1,125 +0,0 @@
-LLAMA 2 COMMUNITY LICENSE AGREEMENT
-Llama 2 Version Release Date: July 18, 2023
-"Agreement" means the terms and conditions for use, reproduction, distribution and
-modification of the Llama Materials set forth herein.
-"Documentation" means the specifications, manuals and documentation
-accompanying Llama 2 distributed by Meta at ai.meta.com/resources/models-and-
-libraries/llama-downloads/.
-"Licensee" or "you" means you, or your employer or any other person or entity (if
-you are entering into this Agreement on such person or entity's behalf), of the age
-required under applicable laws, rules or regulations to provide legal consent and that
-has legal authority to bind your employer or such other person or entity if you are
-entering in this Agreement on their behalf.
-"Llama 2" means the foundational large language models and software and
-algorithms, including machine-learning model code, trained model weights,
-inference-enabling code, training-enabling code, fine-tuning enabling code and other
-elements of the foregoing distributed by Meta at ai.meta.com/resources/models-and-
-libraries/llama-downloads/.
-"Llama Materials" means, collectively, Meta's proprietary Llama 2 and
-Documentation (and any portion thereof) made available under this Agreement.
-"Meta" or "we" means Meta Platforms Ireland Limited (if you are located in or, if you
-are an entity, your principal place of business is in the EEA or Switzerland) and Meta
-Platforms, Inc. (if you are located outside of the EEA or Switzerland).
-By clicking "I Accept" below or by using or distributing any portion or element of the
-Llama Materials, you agree to be bound by this Agreement.
-1. License Rights and Redistribution.
-      a. Grant of Rights. You are granted a non-exclusive, worldwide, non-
-transferable and royalty-free limited license under Meta's intellectual property or
-other rights owned by Meta embodied in the Llama Materials to use, reproduce,
-distribute, copy, create derivative works of, and make modifications to the Llama
-Materials.
-      b. Redistribution and Use.
-            i. If you distribute or make the Llama Materials, or any derivative works
-thereof, available to a third party, you shall provide a copy of this Agreement to such
-third party.
-            ii.  If you receive Llama Materials, or any derivative works thereof, from
-a Licensee as part of an integrated end user product, then Section 2 of this
-Agreement will not apply to you.
-            iii. You must retain in all copies of the Llama Materials that you
-distribute the following attribution notice within a "Notice" text file distributed as a
-part of such copies: "Llama 2 is licensed under the LLAMA 2 Community License,
-Copyright (c) Meta Platforms, Inc. All Rights Reserved."
-            iv. Your use of the Llama Materials must comply with applicable laws
-and regulations (including trade compliance laws and regulations) and adhere to the
-Acceptable Use Policy for the Llama Materials (available at
-https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into
-this Agreement.
-            v. You will not use the Llama Materials or any output or results of the
-Llama Materials to improve any other large language model (excluding Llama 2 or
-derivative works thereof).
-2. Additional Commercial Terms. If, on the Llama 2 version release date, the
-monthly active users of the products or services made available by or for Licensee,
-or Licensee's affiliates, is greater than 700 million monthly active users in the
-preceding calendar month, you must request a license from Meta, which Meta may
-grant to you in its sole discretion, and you are not authorized to exercise any of the
-rights under this Agreement unless or until Meta otherwise expressly grants you
-such rights.
-3. Disclaimer of Warranty. UNLESS REQUIRED BY APPLICABLE LAW, THE
-LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE
-PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND,
-EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY
-WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR
-FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE
-FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING
-THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR
-USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.
-4. Limitation of Liability. IN NO EVENT WILL META OR ITS AFFILIATES BE
-LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT,
-NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS
-AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL,
-CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN
-IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF
-ANY OF THE FOREGOING.
-5. Intellectual Property.
-      a. No trademark licenses are granted under this Agreement, and in
-connection with the Llama Materials, neither Meta nor Licensee may use any name
-or mark owned by or associated with the other or any of its affiliates, except as
-required for reasonable and customary use in describing and redistributing the
-Llama Materials.
-      b. Subject to Meta's ownership of Llama Materials and derivatives made by or
-for Meta, with respect to any derivative works and modifications of the Llama
-Materials that are made by you, as between you and Meta, you are and will be the
-owner of such derivative works and modifications.
-      c. If you institute litigation or other proceedings against Meta or any entity
-(including a cross-claim or counterclaim in a lawsuit) alleging that the Llama
-Materials or Llama 2 outputs or results, or any portion of any of the foregoing,
-constitutes infringement of intellectual property or other rights owned or licensable
-by you, then any licenses granted to you under this Agreement shall terminate as of
-the date such litigation or claim is filed or instituted. You will indemnify and hold
-harmless Meta from and against any claim by any third party arising out of or related
-to your use or distribution of the Llama Materials.
-6. Term and Termination. The term of this Agreement will commence upon your
-acceptance of this Agreement or access to the Llama Materials and will continue in
-full force and effect until terminated in accordance with the terms and conditions
-herein. Meta may terminate this Agreement if you are in breach of any term or
-condition of this Agreement. Upon termination of this Agreement, you shall delete
-and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the
-termination of this Agreement.
-7. Governing Law and Jurisdiction. This Agreement will be governed and
-construed under the laws of the State of California without regard to choice of law
-principles, and the UN Convention on Contracts for the International Sale of Goods
-does not apply to this Agreement. The courts of California shall have exclusive
-jurisdiction of any dispute arising out of this Agreement.

NOTICE DELETED

@@ -1 +0,0 @@
-Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved

README.md CHANGED

@@ -1,65 +1,38 @@
 ---
 license: apache-2.0
-datasets:
-- JetBrains/KExercises
-base_model: meta-llama/CodeLlama-7b-hf
-results:
-- task:
-    type: text-generation
-  dataset:
-    name: MultiPL-HumanEval (Kotlin)
-    type: openai_humaneval
-  metrics:
-  - name: pass@1
-    type: pass@1
-    value: 42.24
-tags:
-- code
 ---
 # Kexer models
-Kexer models are a collection of open-source generative text models fine-tuned on the [Kotlin Exercises](https://huggingface.co/datasets/JetBrains/KExercises) dataset.
-This is a repository for the fine-tuned **CodeLlama-7b** model in the *Hugging Face Transformers* format.
-# How to use
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-# Load pre-trained model and tokenizer
-model_name = 'JetBrains/CodeLlama-7B-Kexer'
-tokenizer = AutoTokenizer.from_pretrained(model_name)
-model = AutoModelForCausalLM.from_pretrained(model_name).to('cuda')
-# Create and encode input
-input_text = """\
-This function takes an integer n and returns factorial of a number:
-fun factorial(n: Int): Int {\
-"""
-input_ids = tokenizer.encode(
-    input_text, return_tensors='pt'
-).to('cuda')
-# Generate
-output = model.generate(
-    input_ids, max_length=60, num_return_sequences=1,
-    early_stopping=True, pad_token_id=tokenizer.eos_token_id,
-)
-# Decode output
-generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
-print(generated_text)
-```
-As with the base model, we can use FIM. To do this, the following format must be used:
-```
-'<PRE> ' + prefix + ' <SUF> ' + suffix + ' <MID>'
 ```
 # Training setup
-The model was trained on one A100 GPU with the following hyperparameters:
 |         **Hyperparameter**           |             **Value**              |
 |:---------------------------:|:----------------------------------------:|
@@ -67,25 +40,19 @@ The model was trained on one A100 GPU with the following hyperparameters:
 |        `max_lr`        |          1e-4          |
 |        `scheduler`        |          linear          |
 |        `total_batch_size`        |          256 (~130K tokens per step)          |
-|        `num_epochs`        |          4          |
-More details about fine-tuning can be found in the technical report (coming soon!).
 # Fine-tuning data
-For tuning this model, we used 15K examples from the synthetically generated [Kotlin Exercises](https://huggingface.co/datasets/JetBrains/KExercises) dataset. Every example follows the HumanEval format. In total, the dataset contains about 3.5M tokens.
 # Evaluation
-For evaluation, we used the [Kotlin HumanEval](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval) dataset, which contains all 161 tasks from HumanEval translated into Kotlin by human experts. More details about the pre-processing needed to reproduce our results, including the code for running the evaluation, are on the [dataset's page](https://huggingface.co/datasets/JetBrains/Kotlin_HumanEval).
-Here are the results of our evaluation:
-|         **Model name**           |             **Kotlin HumanEval Pass Rate**              |
-|:---------------------------:|:----------------------------------------:|
-|           `CodeLlama-7B`            |           26.89            |
-|        `CodeLlama-7B-Kexer`        |          **42.24**         |
-# Ethical considerations and limitations
-CodeLlama-7B-Kexer is a new technology that carries risks with use. The testing conducted to date has not covered, nor could it cover all scenarios. For these reasons, as with all LLMs, CodeLlama-7B-Kexer's potential outputs cannot be predicted in advance, and the model may in some instances produce inaccurate or objectionable responses to user prompts. The model was fine-tuned on a specific data format (Kotlin tasks), and deviation from this format can also lead to inaccurate or undesirable responses to user queries. Therefore, before deploying any applications of CodeLlama-7B-Kexer, developers should perform safety testing and tuning tailored to their specific applications of the model.

 ---
 license: apache-2.0
 ---
 # Kexer models
+Kexer models are a collection of open-source generative text models fine-tuned on the Kotlin Exercises dataset.
+This is a repository for the fine-tuned CodeLlama-7b model in the Hugging Face Transformers format.
+# Model use
 ```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+# Load pre-trained model and tokenizer
+model_name = 'JetBrains/CodeLlama-7B-Kexer'  # Replace with the desired model name
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name).cuda()
+# Encode input text
+input_text = """This function takes an integer n and returns factorial of a number:
+fun factorial(n: Int): Int {"""
+input_ids = tokenizer.encode(input_text, return_tensors='pt').to('cuda')
+# Generate text
+output = model.generate(input_ids, max_length=150, num_return_sequences=1, no_repeat_ngram_size=2, early_stopping=True)
+# Decode and print the generated text
+generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
+print(generated_text)
 ```
 # Training setup
+The model was trained on one A100 GPU with the following hyperparameters:
 |         **Hyperparameter**           |             **Value**              |
 |:---------------------------:|:----------------------------------------:|
 |        `max_lr`        |          1e-4          |
 |        `scheduler`        |          linear          |
 |        `total_batch_size`        |          256 (~130K tokens per step)          |
 # Fine-tuning data
+For this model, we used 15K examples from the Kotlin Exercises dataset {TODO: link!}. For more information about the dataset, follow the link.
 # Evaluation
+For evaluation, we used Kotlin HumanEval (more information here).
+Results for the base and fine-tuned models:
+|         **Model name**           |             **Kotlin HumanEval Pass Rate**              |             **Kotlin Completion**              |
+|:---------------------------:|:----------------------------------------:|:----------------------------------------:|
+|           `base model`            |           26.89            |           0.388            |
+|        `fine-tuned model`        |          42.24         |          0.344          |
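
The two rows of the table are easier to compare as deltas. The sketch below is illustrative only: it copies the scores from the table and computes the absolute and relative change per metric. The helper names are hypothetical, and the sketch makes no assumption about whether a higher `Kotlin Completion` score is better.

```python
# Scores copied from the evaluation table above.
base = {"Kotlin HumanEval Pass Rate": 26.89, "Kotlin Completion": 0.388}
tuned = {"Kotlin HumanEval Pass Rate": 42.24, "Kotlin Completion": 0.344}


def absolute_change(before: float, after: float) -> float:
    """Difference between the fine-tuned and base scores."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change expressed as a percentage of the base score."""
    return (after - before) / before * 100


for metric in base:
    delta = absolute_change(base[metric], tuned[metric])
    percent = relative_change(base[metric], tuned[metric])
    print(f"{metric}: {delta:+.3f} ({percent:+.1f}%)")
```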