Prompting for different programming languages?

#40
by TAO12138 - opened

Taking quick_sort as an example, how should the prompt be written for different programming languages at inference time?

For Python, StarCoder directly generates the expected output from the prompt # quick sort. With other languages, however, it goes wrong. How can we make the model recognize which language is required?

Should the prompt look like // language: c++\n and # language: Python\n? With prompts of this form, the test results are satisfactory.

BigCode org

We did condition on the filename during pretraining, so you can try prepending <filename>file_path.ext\n to your prompt, where ext is the extension of the language you want to generate code in. You can change the file path as you want.
For example, we found that <filename>solutions/solution_1.py\n# Here is the correct implementation of the code exercise\n helps with solving HumanEval problems in Python.
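As a rough sketch of the suggestion above, the snippet below builds such prompts as plain strings. The helper name build_prompt and the path solutions/quick_sort.cpp are hypothetical, chosen only for illustration; the <filename> tag and the Python example come from the reply above.

```python
def build_prompt(file_path: str, instruction: str = "") -> str:
    """Prefix a prompt with the <filename> tag so the extension hints at the language.

    build_prompt is a hypothetical helper, not part of any library.
    """
    prompt = f"<filename>{file_path}\n"
    if instruction:
        # The instruction should use the comment syntax of the target language.
        prompt += f"{instruction}\n"
    return prompt


# Hypothetical C++ prompt for the quick sort example from the question.
cpp_prompt = build_prompt("solutions/quick_sort.cpp", "// quick sort")

# The Python prompt quoted in the reply above.
py_prompt = build_prompt(
    "solutions/solution_1.py",
    "# Here is the correct implementation of the code exercise",
)

print(repr(cpp_prompt))
print(repr(py_prompt))
```

The resulting string is what you would pass to the tokenizer; the model then continues from the end of the comment line.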

Thank you! But in FIM mode, should I add that before or after <fim_prefix>?
For example, either

<filename>solutions/solution_1.py
<fim_prefix>...<fim_suffix>...

or

<fim_prefix><filename>solutions/solution_1.py
...<fim_suffix>...
BigCode org

This should be part of the prefix, so the second option is the correct one. You can check this code that we use for FIM evaluation.
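A minimal sketch of the second option, with the <filename> tag placed inside the prefix, might look like the following. The helper build_fim_prompt and the example code fragments are hypothetical; the trailing <fim_middle> token is an assumption based on StarCoder's usual prefix-suffix-middle layout and is not stated in this thread, so check it against the evaluation code mentioned above.

```python
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"  # assumption: the token after which the model writes the infill


def build_fim_prompt(file_path: str, prefix: str, suffix: str) -> str:
    """Build a FIM prompt with the <filename> tag as part of the prefix.

    build_fim_prompt is a hypothetical helper, not part of any library.
    """
    return (
        f"{FIM_PREFIX}<filename>{file_path}\n"
        f"{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"
    )


# Hypothetical fragments: the model is asked to fill in the function body.
fim_prompt = build_fim_prompt(
    "solutions/solution_1.py",
    prefix="def quick_sort(arr):\n",
    suffix="\n    return result\n",
)
print(repr(fim_prompt))
```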

loubnabnl changed discussion status to closed
