syzymon committed on
Commit
67ca5cc
·
1 Parent(s): a2b2bbf

Update README.md

Files changed (1)
  1. README.md +82 -0
README.md CHANGED
@@ -1,3 +1,85 @@
1
  ---
2
  license: llama2
3
  ---
4
+ # LongLLaMA-Code 7B Instruct
5
+
6
+
7
+ <div align="center">
8
+
9
+ <table>
10
+ <tr>
11
+ <th style="font-size: 120%"> >_ 🎓 <a href="https://huggingface.co/syzymon/long_llama_code_7b_instruct">LongLLaMA-Code 7B Instruct</a> 📑🗨 </th>
12
+ </tr>
13
+ <tr>
14
+ <td align="center">
15
+ <a href="https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_code_instruct_colab.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg"></a>
16
+ </td>
17
+
18
+ </tr>
19
+ </table>
20
+
21
+ </div>
22
+
23
+
24
+ ## TLDR
25
+ [LongLLaMA-Code 7B Instruct](https://huggingface.co/syzymon/long_llama_code_7b_instruct) is [LongLLaMA-Code 7B](https://huggingface.co/syzymon/long_llama_code_7b) tuned on [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), and [ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) datasets. It can answer basic questions about research papers and code. It can also perform simple code refactoring. You can try the quantized version of the model using a free GPU in [Google Colab](https://colab.research.google.com/github/CStanKonrad/long_llama/blob/main/long_llama_code_instruct_colab.ipynb).
26
+
27
+ ## Tuning
28
+
29
+ ### Code
30
+ The model was tuned on a TPU v3-128 pod with a batch size of 128.
31
+ For tuning, we used the data preparation pipeline available in instruction_fine_tuning.
32
+ However, we replaced the Hugging Face Trainer with a modified version of the FoT continued pretraining code. The modification boils down to propagating the memory cache throughout the model (essentially reproducing the functionality of the PyTorch inference code in JAX).
33
+
34
+ ### Training
35
+ Here, we present the basic information about how the model was tuned. For more details, see the [GitHub repo](https://github.com/CStanKonrad/long_llama/tree/main/instruction_fine_tuning/misc).
36
+
37
+
38
+ All inputs were truncated and randomly padded (left/right) to 3072 tokens.
39
+ The last context length was set to 1536.
40
+ The model was trained for 9k steps, with an initial learning rate of 1.2e-5, 700 steps of warmup, and a final learning rate of 0.
41
+ The optimizer was AdamW.
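The truncation and padding step described above can be sketched in plain Python. This is only an illustration, not the repository's code: the helper name `truncate_and_pad`, the padding id `PAD_ID`, keeping the first 3072 tokens when truncating, and choosing the left/right split uniformly at random are all assumptions, since the text only states that inputs were truncated and randomly padded (left/right) to 3072 tokens.

```python
import random

SEQ_LEN = 3072  # target length stated in the text
PAD_ID = 0      # hypothetical padding token id (assumption)


def truncate_and_pad(token_ids, seq_len=SEQ_LEN, pad_id=PAD_ID, rng=random):
    """Return (padded_ids, mask), both of length seq_len.

    Assumptions: truncation keeps the first seq_len tokens, and the
    padding is split between the left and right side at random.
    """
    tokens = list(token_ids)[:seq_len]
    pad_total = seq_len - len(tokens)
    pad_left = rng.randint(0, pad_total)  # inclusive on both ends
    pad_right = pad_total - pad_left
    padded = [pad_id] * pad_left + tokens + [pad_id] * pad_right
    # mask marks real tokens with 1 and padding with 0
    mask = [0] * pad_left + [1] * len(tokens) + [0] * pad_right
    return padded, mask


padded, mask = truncate_and_pad(range(1, 11), rng=random.Random(0))
print(len(padded), sum(mask))
```

Returning a mask alongside the padded ids makes it possible to ignore padding positions later, whichever side they end up on.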
42
+
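The learning rate schedule described above can be sketched as a small function. The step counts and the 1.2e-5 value come from the text; the linear shape of both warmup and decay, the warmup starting from 0, and reading 1.2e-5 as the rate reached once warmup ends are assumptions, since the text does not state them. The function name `learning_rate` is illustrative.

```python
PEAK_LR = 1.2e-5      # learning rate stated in the text
WARMUP_STEPS = 700    # warmup steps stated in the text
TOTAL_STEPS = 9_000   # total training steps stated in the text


def learning_rate(step: int) -> float:
    """Linear warmup to PEAK_LR, then linear decay to 0 (assumed shapes)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    remaining = max(TOTAL_STEPS - step, 0)
    return PEAK_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


for step in (0, 350, 700, 4850, 9000):
    print(step, learning_rate(step))
```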
43
+ The question prompt (`pre_question_text`) was:
44
+ ```
45
+ You are an AI assistant. User will you give you a task. Your goal is to complete the task as faithfully as you can.\n\n
46
+ ```
47
+
48
+ To trigger the model's answer, one can use:
49
+ ```
50
+ \nAnswer:
51
+ ```
52
+
53
+ The chat prompt was:
54
+ ```
55
+ A chat between a user (denoted as USER:) and an artificial intelligence assistant (denoted as ASSISTANT:). The assistant gives helpful, detailed, and polite answers to the user's questions.\n\n
56
+ ```
57
+
58
+ To denote the assistant, one can write:
59
+ ```
60
+ \nASSISTANT:
61
+ ```
62
+
63
+ To denote the user, one can write:
64
+ ```
65
+ \nUSER:
66
+ ```
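As a rough illustration of how the prompt pieces above fit together, the following sketch assembles a question prompt and a chat prompt. The prompt strings are copied verbatim from the text, with `\n` interpreted as a newline character. The helper names `build_question_prompt` and `build_chat_prompt` and the single space placed between a marker and the message are assumptions, not part of the released code.

```python
# Strings copied verbatim from the text above ("\n" read as a newline).
QUESTION_PREFIX = (
    "You are an AI assistant. User will you give you a task. "
    "Your goal is to complete the task as faithfully as you can.\n\n"
)
ANSWER_TRIGGER = "\nAnswer:"
CHAT_PREFIX = (
    "A chat between a user (denoted as USER:) and an artificial intelligence "
    "assistant (denoted as ASSISTANT:). The assistant gives helpful, detailed, "
    "and polite answers to the user's questions.\n\n"
)
USER_MARKER = "\nUSER:"
ASSISTANT_MARKER = "\nASSISTANT:"


def build_question_prompt(question: str) -> str:
    """Question-style prompt that ends with the answer trigger."""
    return QUESTION_PREFIX + question + ANSWER_TRIGGER


def build_chat_prompt(turns) -> str:
    """Chat-style prompt from (role, text) pairs; role is 'user' or 'assistant'.

    The prompt ends with the assistant marker so the model writes the next reply.
    """
    markers = {"user": USER_MARKER, "assistant": ASSISTANT_MARKER}
    parts = [CHAT_PREFIX]
    for role, text in turns:
        # The space after the marker is an assumption of this sketch.
        parts.append(f"{markers[role]} {text}")
    parts.append(ASSISTANT_MARKER)
    return "".join(parts)


print(build_question_prompt("What does this function return?"))
print(build_chat_prompt([("user", "Hi"), ("assistant", "Hello!"), ("user", "Refactor my code.")]))
```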
67
+
68
+ ### Datasets and sampling probability
69
+ * 0.71 - [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct)
70
+ * 0.16 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) questions with fewer than 5k chars
71
+ * 0.08 - [Open-Orca/OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca) questions above 5k chars but below 12k chars
72
+ * 0.02 - [zetavg/ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) conversations below 6k chars
73
+ * 0.01 - [zetavg/ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) conversations above 6k chars but below 12k chars
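A mixture like the one listed above could be sampled as follows. This is a minimal sketch rather than the actual data pipeline: the bucket labels and the function name `sample_source` are illustrative. The listed values add up to 0.98, so the sketch treats them as relative weights, which `random.choices` accepts without normalization.

```python
import random
from collections import Counter

# Sampling weights as listed above; the bucket labels are illustrative.
MIXTURE = {
    "MathInstruct": 0.71,
    "OpenOrca (<5k chars)": 0.16,
    "OpenOrca (5k-12k chars)": 0.08,
    "ShareGPT (<6k chars)": 0.02,
    "ShareGPT (6k-12k chars)": 0.01,
}


def sample_source(rng: random.Random) -> str:
    """Pick the dataset bucket that the next training example is drawn from."""
    names = list(MIXTURE)
    weights = list(MIXTURE.values())
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)
counts = Counter(sample_source(rng) for _ in range(10_000))
for name in MIXTURE:
    print(f"{name}: {counts[name] / 10_000:.3f}")
```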
74
+
75
+ To improve the quality of the data, the datasets were filtered using regular expressions.
76
+
77
+
78
+
79
+ ## License
80
+ The instruction/chat-tuned models are for research purposes only.
81
+ [LongLLaMA-Code 7B Instruct](https://huggingface.co/syzymon/long_llama_code_7b_instruct) is [LongLLaMA-Code 7B](https://huggingface.co/syzymon/long_llama_code_7b) tuned on [TIGER-Lab/MathInstruct](https://huggingface.co/datasets/TIGER-Lab/MathInstruct), [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca), and [ShareGPT-Processed](https://huggingface.co/datasets/zetavg/ShareGPT-Processed) datasets. Note that those datasets contain outputs from ChatGPT. See also the [codellama/CodeLlama-7b-hf](https://huggingface.co/codellama/CodeLlama-7b-hf) license.
82
+
83
+ ## Acknowledgements
84
+ We gratefully acknowledge the TPU Research Cloud program, which was instrumental to our research by providing significant computational resources.
85
+