SantaCoder Hyperparameter Insights

#30
by nandovallec - opened

Hello everyone,
I'm currently working on my master's thesis on LLMs applied to code. In my experiments, I am comparing multiple models on code summarization and code synthesis tasks in Java. Since one of the models I will be using is InCoder, I thought including SantaCoder could be an interesting way to further support the claim that the two are comparable.

I would like to ask whether you have any advice on using SantaCoder for these tasks, in terms of hyperparameters or prompt engineering. I checked the SantaCoder demo and noticed that you set a low temperature for the infill task but not for code generation. I was therefore wondering whether you have any other insights that might improve the model's performance.
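
To make the question concrete, here is a minimal sketch of the two setups I mean. Everything in it is an assumption for illustration: the helper names `build_infill_prompt` and `sampling_kwargs` are hypothetical, the fill-in-the-middle token strings are the ones I believe the SantaCoder model card uses, and the temperature and length values are placeholders, not the demo's actual settings. The returned dictionaries are meant to be passed as keyword arguments to `model.generate` in Transformers.

```python
# Sketch only: token strings and numeric values below are assumptions,
# not the settings used by the SantaCoder demo.
FIM_PREFIX = "<fim-prefix>"
FIM_SUFFIX = "<fim-suffix>"
FIM_MIDDLE = "<fim-middle>"


def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in fill-in-the-middle tokens."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def sampling_kwargs(task: str) -> dict:
    """Return placeholder generation settings for a given task (hypothetical values)."""
    if task == "infill":
        # Low temperature: the gap is constrained by the surrounding code.
        return {"do_sample": True, "temperature": 0.2, "top_p": 0.95, "max_new_tokens": 64}
    if task == "generation":
        # Higher temperature: more diverse samples for open-ended synthesis.
        return {"do_sample": True, "temperature": 0.8, "top_p": 0.95, "max_new_tokens": 256}
    raise ValueError(f"unknown task: {task}")


# Example: a Java method whose body should be filled in.
prompt = build_infill_prompt(
    prefix="public static int add(int a, int b) {\n    ",
    suffix="\n}",
)
infill_settings = sampling_kwargs("infill")
```

With a loaded tokenizer and model, the call would look roughly like `model.generate(**tokenizer(prompt, return_tensors="pt"), **infill_settings)`. What I am unsure about is which values to use for each task.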

Best regards.
