KrisPi committed
Commit
c7f00ea
1 Parent(s): 6fa21c8

Update README.md

Files changed (1):
  1. README.md +9 -0
README.md CHANGED
 
@@ -16,6 +16,11 @@ To prove the point I'm planning to create a few more finetunes like this, starti
5 epochs, LR=1e-05, batch=2, gradient accumulation 32 (i.e. simulating an effective batch size of 64), max_len=1024. Rank and alpha both 128, targeting all modules. Trained in bfloat16. Constant schedule, no warm-up.
Flash-Attention 2 turned off due to an issue with batching.

+ Expected result:
+ A new system prompt that prefers a docstring under each function, uses multiple functions even where one would do, and comments on every line of the code; it should also greatly reduce explanations before and after the code block.
+ As a result, the model should be more readable for junior Python developers and should reason step by step by default, improving both the code and the HumanEval results.
+
+
Evals:
HumanEval score (a 2.4 p.p. improvement over the best Phind v2 score!) for the new prompt:
**{'pass@1': 0.7621951219512195}**
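The gradient-accumulation setup above (per-device batch 2, 32 accumulation steps) can be sanity-checked with a tiny helper; `effective_batch_size` is a hypothetical name for illustration, not code from the repo:

```python
# Hypothetical helper: effective batch size when simulating a larger
# batch via gradient accumulation, as in the training run above.
def effective_batch_size(per_device_batch: int,
                         accumulation_steps: int,
                         num_devices: int = 1) -> int:
    # Each optimizer step accumulates gradients over this many samples.
    return per_device_batch * accumulation_steps * num_devices

print(effective_batch_size(2, 32))  # 64, the batch size the run simulates
```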
 
@@ -35,6 +40,10 @@ After several HumanEval tests and prompts Phind v2 was maximum able to score: 73
In the long term, I'm planning to experiment with LIMA + DPO fine-tuning, but so far I've noticed that LIMA datasets need to be both general and task-specific. The best result I got was with around 30% of the samples being task-specific.
https://huggingface.co/datasets/KrisPi/PythonTutor-Evol-1k-DPO-GPT4_vs_35

+ ```
+ ### System Prompt\nYou are an intelligent assistant.\n\n### User Message\nTake a deep breath and think step by step, make sure to verify your solution will pass example test cases. Write in the most simple manner using mutiple functions, simple loops and if statements, do not compress code, the code will be read by other developer.\n{PROMPT}\n\n### Assistant\n
+ ```
+
r=128,
lora_alpha=128,
target_modules=['q_proj','k_proj','v_proj','o_proj','gate_proj','down_proj','up_proj'],
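The `r`, `lora_alpha`, and `target_modules` lines above map onto a LoRA adapter configuration. A minimal sketch, assuming the Hugging Face `peft` library; the dropout and bias values are assumptions not stated in the README, and the surrounding model/trainer setup is omitted:

```python
from peft import LoraConfig

# LoRA settings as listed above: rank and alpha both 128,
# applied to every attention and MLP projection.
lora_config = LoraConfig(
    r=128,
    lora_alpha=128,
    target_modules=['q_proj', 'k_proj', 'v_proj', 'o_proj',
                    'gate_proj', 'down_proj', 'up_proj'],
    lora_dropout=0.0,     # assumed; not stated in the README
    bias="none",          # assumed; not stated in the README
    task_type="CAUSAL_LM",
)
```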
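For illustration, the new HumanEval prompt shown earlier could be assembled like this. The template string is copied verbatim from the README (including its original "mutiple" typo, kept so it matches the evaluated prompt exactly); `build_prompt` is a hypothetical helper name:

```python
# Prompt template copied verbatim from the README, typo and all.
TEMPLATE = (
    "### System Prompt\nYou are an intelligent assistant.\n\n"
    "### User Message\nTake a deep breath and think step by step, make sure "
    "to verify your solution will pass example test cases. Write in the most "
    "simple manner using mutiple functions, simple loops and if statements, "
    "do not compress code, the code will be read by other developer.\n"
    "{PROMPT}\n\n### Assistant\n"
)

def build_prompt(task: str) -> str:
    # Hypothetical helper: drop a HumanEval task into the {PROMPT} slot.
    return TEMPLATE.format(PROMPT=task)

print(build_prompt('def add(a, b):\n    """Return a + b."""\n'))
```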