Difference between starcoderbase and starcoder
Hi,
I have been trying to autocomplete partial code using StarCoder, but I guess it is further fine-tuned only on Python, so everything it generates is Python.
So I tried StarCoderBase for a language called Verilog, but it never completes anything. Am I missing anything?
@shailja Have you tried prompting the model with the language name, or maybe the required file extension?
@SivilTaram
Can you give an example of an input that would include the language name or extension? What would this look like in the input string that is tokenized and passed to generate()?
NM, I found what I believe is the answer on the StarCoder model card page. Fill in FILENAME below:
<reponame>REPONAME<filename>FILENAME<gh_stars>STARS
code<|endoftext|>
@shailja - I see that Verilog and its variants are in the list of programming languages that StarCoderBase is trained on. The list of programming languages is here: https://huggingface.co/datasets/bigcode/the-stack/blob/main/programming-languages.json
So it should be able to do code completion given the <filename> token with the appropriate file extension, as shown above by @spew.
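To make the template above concrete, here is a minimal sketch of how the prompt string could be assembled before tokenizing. The helper name build_prompt, the file name counter.v, and the Verilog snippet are made up for illustration, and treating <reponame> and <gh_stars> as optional is an assumption rather than something stated in this thread.

```python
from typing import Optional


def build_prompt(
    code_prefix: str,
    filename: str,
    reponame: Optional[str] = None,
    stars: Optional[str] = None,
) -> str:
    """Prefix partial code with the metadata tokens from the model card template.

    Hypothetical helper: only the token layout comes from the template quoted
    in this thread; omitting reponame/stars is an assumption.
    """
    parts = []
    if reponame is not None:
        parts.append(f"<reponame>{reponame}")
    parts.append(f"<filename>{filename}")
    if stars is not None:
        parts.append(f"<gh_stars>{stars}")
    # No <|endoftext|> at the end: the code is left open so the model
    # can continue it.
    return "".join(parts) + "\n" + code_prefix


# Illustrative Verilog prefix; the .v extension hints at the language.
prompt = build_prompt(
    "module counter(input clk, input rst, output reg [3:0] q);\n",
    filename="counter.v",
)
print(prompt)
```

The resulting string is what would be tokenized and passed to generate(). Whether this is enough to get Verilog completions out of StarCoderBase has not been verified here.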