Issue with using the codellama-7b model

#17
by RyanAX - opened

I have set up the codellama-7b model locally and used the official example, but the final result does not meet expectations. Here is the code:

import time

import torch
from transformers import CodeLlamaTokenizer, LlamaForCausalLM

# Load the local checkpoint and move the model to the GPU in bfloat16
codeLlama_tokenizer = CodeLlamaTokenizer.from_pretrained("./CodeLlama-7b-hf", padding_side='left')
codeLlama_model = LlamaForCausalLM.from_pretrained("./CodeLlama-7b-hf")
codeLlama_model.to(device='cuda:0', dtype=torch.bfloat16)

text = '''def remove_non_ascii(s: str) -> str:
        """ <FILL_ME>
        return result
    '''

start_time = time.time()
input_ids = codeLlama_tokenizer(text, return_tensors="pt")["input_ids"]
input_ids = input_ids.to('cuda')
generated_ids = codeLlama_model.generate(input_ids, max_new_tokens=200, do_sample=True, top_p=0.9, temperature=0.1, num_return_sequences=1, repetition_penalty=1.05, eos_token_id=codeLlama_tokenizer.eos_token_id, pad_token_id=codeLlama_tokenizer.pad_token_id)
filling = codeLlama_tokenizer.batch_decode(generated_ids[:, input_ids.shape[1]:], skip_special_tokens=True)[0]
print(filling)

The output of the code is:

Remove non-ascii characters from a string. """
        result = ""
        for c in s:
            if ord(c) < 128:
                result += c
    }

    public void setId(String id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getDescription() {
        return description;
    }

    public void setDescription(String description) {
        this.description = description;
    }

    public String getType() {
        return type;
}

There are two issues with the generated code that don't meet expectations:
1. It doesn't take the suffix into account and seems to ignore everything after <FILL_ME>.
2. After completing the desired part of the code, it adds a lot of unnecessary additional code.

Is this behavior normal? Is there any way to improve it?

I have the same questions.

Code Llama org

A few things to note here.

  1. To check whether <FILL_ME> is taken into account, you need to make sure the input ids are properly formatted (see the sketch below).
  2. The outputs we have match 1-1 with the original outputs. But when you generate with sampling, a custom temperature, etc., you should expect some hallucination. Especially if the eos token is not properly set, the model will not stop early enough :/
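
For instance, a quick (untested) sanity check, assuming the variables from the original post and the tokenizer's default infilling sentinel tokens ▁<PRE> / ▁<SUF> / ▁<MID>:

# Map the input ids back to tokens to see how <FILL_ME> was expanded.
tokens = codeLlama_tokenizer.convert_ids_to_tokens(input_ids[0].tolist())
print(tokens[:5])           # the ▁<PRE> marker should appear near the start
print("▁<SUF>" in tokens)   # if the ▁<SUF>/▁<MID> markers are missing, the
print("▁<MID>" in tokens)   # text after <FILL_ME> is being ignored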

Thanks for opening the issue!

Regarding the unnecessary additional code, in my case it was helpful to use a repetition penalty of 0.9. Maybe that helps in your case as well! :)
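
Roughly, reusing the call from the original post with only that parameter changed (untested):

generated_ids = codeLlama_model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,
    top_p=0.9,
    temperature=0.1,
    num_return_sequences=1,
    repetition_penalty=0.9,  # lowered from 1.05
    eos_token_id=codeLlama_tokenizer.eos_token_id,
    pad_token_id=codeLlama_tokenizer.pad_token_id,
)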
