Help with using this model for application maintenance

#5
by joluinfante - opened

I'm trying to simplify the maintenance of a C/C++ application made up of a number of source files. Assuming that Polycoder is already trained on C and C++, my idea is to re-train it on my application's logic. I understand that it is not simply "make the sources available and let the AI do magic": from what I have read, I also need to tell the model about the relationships between all the functions (function names, input and output parameters) using an AST or something similar. Since that will require a lot of processing power, I want to start with a simple program that has only a few functions.

Obviously, I know little about this and I'm trying to learn. I understand that I have to use the tokenizer to convert this information into the form the model requires, and then produce the re-trained model. Finally, through prompts, I want to find out what kinds of questions I can ask it, based on my training. I mean, I don't want magic, just an iterative process where I learn how to train and how to ask.

Could you give me a simple example of how to start with this in Python? I have seen that the model can be asked to produce code, but only starting from its general training. I intend to re-train it (or extend its knowledge) using a simple application.
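To make the "function names, input and output parameters" idea concrete, here is a minimal sketch of the kind of information meant, using only the Python standard library. It is not a real AST: the regular expression, the `extract_functions` helper, and the sample C program are all assumptions made up for illustration, and a proper parser (libclang or tree-sitter, for example) would be needed for real code with macros, comments, or templates.

```python
import re

# Crude pattern for C-style function definitions: return type, name,
# parameter list, then an opening brace. An illustrative assumption only.
FUNC_RE = re.compile(
    r"^\s*([A-Za-z_][\w\s\*]*?)\s*\b([A-Za-z_]\w*)\s*\(([^;{}()]*)\)\s*\{",
    re.MULTILINE,
)
# Control-flow keywords that can look like a function name to the pattern.
KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}


def body_after(source, open_brace):
    """Return the text between the brace at open_brace and its match.

    Braces inside strings or comments are not handled (a known limitation).
    """
    depth = 0
    for i in range(open_brace, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return source[open_brace + 1:i]
    return source[open_brace + 1:]


def extract_functions(source):
    """List each function's return type, name, parameters, and callees."""
    funcs = []
    for m in FUNC_RE.finditer(source):
        returns, name, params = m.group(1).strip(), m.group(2), m.group(3).strip()
        if name in KEYWORDS:
            continue
        param_list = [] if params in ("", "void") else [p.strip() for p in params.split(",")]
        funcs.append({
            "returns": returns,
            "name": name,
            "params": param_list,
            "body": body_after(source, m.end() - 1),
        })
    # Second pass: record which other known functions each body calls.
    known = {f["name"] for f in funcs}
    for f in funcs:
        f["calls"] = sorted(
            n for n in known - {f["name"]}
            if re.search(r"\b" + re.escape(n) + r"\s*\(", f["body"])
        )
    return funcs


# Hypothetical three-function program standing in for "a simple program".
SAMPLE = """
int add(int a, int b) {
    return a + b;
}

static double scale(double x, double factor)
{
    return x * factor;
}

int main(void) {
    return add(1, 2);
}
"""

funcs = extract_functions(SAMPLE)
for f in funcs:
    print(f["returns"], f["name"], f["params"], "calls:", f["calls"])
```

Each resulting entry could then be turned into a line of text and placed next to the source code in the training data, though whether that actually helps the model is something that would have to be tested.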

I kept reading about this. It seems that I have to create a dataset from my example source code. But I don't want to build a new dataset from scratch, because then I would have to teach the model C and C++ all over again. I want to add knowledge on top of an existing dataset (in this case, the Polycoder one). Or would simply loading many C and C++ sources give me a dataset that lets the AI learn to code in C and C++ from my own source code?
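As a rough sketch of the "create a dataset from my source code" step, the following collects C/C++ files from a folder and writes them as JSON Lines records, one chunk of source per line. Everything here is an assumption for illustration: the `build_dataset` helper, the `text`/`path`/`chunk` field names, the 40-line chunk size, and the tiny demo folder. It does not say anything about how Polycoder's own training data was built.

```python
import json
import tempfile
from pathlib import Path

# File extensions treated as C/C++ sources (an assumption; adjust as needed).
SOURCE_SUFFIXES = {".c", ".h", ".cpp", ".hpp", ".cc"}


def iter_source_files(root):
    """Yield C/C++ source files under root, in a stable sorted order."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in SOURCE_SUFFIXES:
            yield path


def chunk_lines(text, max_lines=40):
    """Split text into consecutive chunks of at most max_lines lines."""
    lines = text.splitlines()
    for start in range(0, len(lines), max_lines):
        yield "\n".join(lines[start:start + max_lines])


def build_dataset(root, out_path, max_lines=40):
    """Write one JSON record per chunk and return the number of records."""
    count = 0
    with open(out_path, "w", encoding="utf-8") as out:
        for path in iter_source_files(root):
            text = path.read_text(encoding="utf-8", errors="replace")
            for i, chunk in enumerate(chunk_lines(text, max_lines)):
                record = {"path": str(path.relative_to(root)), "chunk": i, "text": chunk}
                out.write(json.dumps(record) + "\n")
                count += 1
    return count


# Demo on a throwaway folder holding one hypothetical source file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "app"
    root.mkdir()
    (root / "util.c").write_text(
        "int add(int a, int b) {\n    return a + b;\n}\n", encoding="utf-8"
    )
    (root / "notes.txt").write_text("not source code\n", encoding="utf-8")
    out_file = Path(tmp) / "train.jsonl"
    n_records = build_dataset(root, out_file, max_lines=2)
    records = [json.loads(line) for line in out_file.read_text(encoding="utf-8").splitlines()]

print(n_records, [r["path"] for r in records])
```

A file like this contains only the application's own code. Fine-tuning, as generally practiced, continues training from the already trained model weights rather than merging new files into the original dataset, so the small dataset would be used on top of what the model already knows about C and C++. A JSON Lines file of this shape can typically be loaded with the Hugging Face `datasets` library via `load_dataset("json", data_files=...)` before tokenization.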
