Petr Tsvetkov committed on
Commit
574fdf5
1 Parent(s): 30e165f

Add some examples to the synthetic ds generation prompt

Files changed (1)
  1. generate_synthetic_dataset.py +35 -1
generate_synthetic_dataset.py CHANGED
@@ -15,6 +15,35 @@ client = GrazieApiGatewayClient(
 )
 
 
+def get_example_prompt(start_msg, end_msg):
+    return f"""START OF THE EXAMPLE
+
+For following the edited message:
+START OF THE EDITED COMMIT MESSAGE
+{end_msg}
+END OF THE EDITED COMMIT MESSAGE
+
+You would output the following initial commit message:
+START OF THE INITIAL COMMIT MESSAGE
+{start_msg}
+END OF THE INITIAL COMMIT MESSAGE
+
+END OF THE EXAMPLE"""
+
+
+def generate_examples():
+    manual_df = hf_data_loader.load_raw_rewriting_dataset_as_pandas()[['commit_msg_start', 'commit_msg_end']]
+    examples = [
+        get_example_prompt(row['commit_msg_start'], row['commit_msg_end'])
+        for _, row in manual_df.iterrows()
+    ]
+
+    return "\n".join(examples)
+
+
+EXAMPLES = generate_examples()
+
+
 def build_prompt(reference, diff):
     return f"""A software developer uses a LLM to generate commit messages.
 
@@ -29,7 +58,12 @@ START OF THE COMMIT MESSAGE
 {reference}
 END OF THE COMMIT MESSAGE
 
-Your task is to print the initial, LLM-generated commit message. Print only the initial commit message's text after the
+Your task is to print the initial, LLM-generated commit message. Here are some examples of what you should output:
+START OF THE EXAMPLES LIST
+{EXAMPLES}
+END OF THE EXAMPLES LIST
+
+Print only the initial commit message's text after the
 token "OUTPUT".
 
 OUTPUT"""
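
To see what the new `{EXAMPLES}` placeholder expands to, here is a minimal, self-contained sketch of the rendering step. It reuses the template text and column names from the diff, but it is not the committed code: a plain list of dicts stands in for the DataFrame returned by `hf_data_loader.load_raw_rewriting_dataset_as_pandas()`, `generate_examples` takes the rows as a parameter (an assumption made so the sketch runs without the dataset), and `SAMPLE_ROWS` is invented for illustration.

```python
def get_example_prompt(start_msg: str, end_msg: str) -> str:
    """Render one few-shot example using the template text from the diff."""
    return f"""START OF THE EXAMPLE

For following the edited message:
START OF THE EDITED COMMIT MESSAGE
{end_msg}
END OF THE EDITED COMMIT MESSAGE

You would output the following initial commit message:
START OF THE INITIAL COMMIT MESSAGE
{start_msg}
END OF THE INITIAL COMMIT MESSAGE

END OF THE EXAMPLE"""


def generate_examples(rows) -> str:
    """Join one rendered example per row with a single newline.

    `rows` is any iterable of mappings that carry the two column names
    used in the diff. Passing it in as an argument is an assumption of
    this sketch; the commit loads the data inside the function instead.
    """
    return "\n".join(
        get_example_prompt(row["commit_msg_start"], row["commit_msg_end"])
        for row in rows
    )


# Hypothetical sample rows, invented for illustration only.
SAMPLE_ROWS = [
    {
        "commit_msg_start": "Update file",
        "commit_msg_end": "Fix off-by-one error in pagination",
    },
    {
        "commit_msg_start": "Add tests",
        "commit_msg_end": "Add unit tests for the diff parser",
    },
]

if __name__ == "__main__":
    print(generate_examples(SAMPLE_ROWS))
```

Because the examples are joined with a single `"\n"`, one example's `END OF THE EXAMPLE` line is followed directly by the next example's `START OF THE EXAMPLE` line, with no blank line between them. Note also that in the commit `EXAMPLES` is computed at module import time, so the dataset is loaded as soon as `generate_synthetic_dataset.py` is imported.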