Difference between starcoderbase and starcoder

#10
by shailja - opened

Hi,

I have been trying to autocomplete partial code using starcoder, but I guess it was further fine-tuned only on Python, so everything it generates is Python.

So I tried starcoderbase for a language called Verilog, but it never completes anything. Am I missing anything?

BigCode org

@shailja Have you tried prompting the model with the language name, or perhaps with the file extension for that language?

@SivilTaram Can you give an example of an input that would include the language name or extension? What would this look like in the input string that is tokenized and passed to generate()?

NM, I found what I believe is the answer on the starcoder model card page. Fill in FILENAME below:

<reponame>REPONAME<filename>FILENAME<gh_stars>STARS
code<|endoftext|>

@shailja - I see that Verilog and variants of it are in the list of programming languages that StarCoderBase is trained on. The list of programming languages is here: https://huggingface.co/datasets/bigcode/the-stack/blob/main/programming-languages.json

So it should be able to do code completion when given the metadata tokens with the appropriate file extension, as shown above by @spew.
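
For anyone landing here later, below is a rough sketch of how such a prompt string could be assembled for a Verilog file. It only assumes the metadata format quoted from the model card above. The helper name `build_prompt`, the example filename `counter.v`, and the idea that the `<reponame>` and `<gh_stars>` fields can be left out are my own assumptions, not something confirmed in this thread. The trailing `<|endoftext|>` is left off on purpose so the model can continue the code.

```python
def build_prompt(partial_code, filename, reponame=None, stars=None):
    """Prefix partial code with the metadata tokens quoted from the model card.

    Hypothetical helper: omitting <reponame> and <gh_stars> is an assumption.
    """
    parts = []
    if reponame is not None:
        parts.append(f"<reponame>{reponame}")
    # The file extension (.v here) is the hint about the target language.
    parts.append(f"<filename>{filename}")
    if stars is not None:
        parts.append(f"<gh_stars>{stars}")
    # Metadata on the first line, code after it; no <|endoftext|> at the end.
    return "".join(parts) + "\n" + partial_code


partial = "module counter(input clk, input rst, output reg [3:0] q);\n"
prompt = build_prompt(partial, "counter.v")
print(prompt)
```

The resulting `prompt` is the string you would tokenize and pass to generate(). Whether this actually improves Verilog completions is something you would need to check yourself.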
