Required PyTorch version
I tried running the model with PyTorch 1.9.0+cu111, but the generated text is bizarre, with duplicated words. Could you tell me which versions of torch and the other libraries are required? Thank you.
Can you please share the code you used to generate text? The PyTorch version shouldn't affect generation. One thing to pay attention to is not passing the token_type_ids
returned by the tokenizer to the model. Here's a working example using the model in both the standard and FIM settings:
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("bigcode/santacoder", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("bigcode/santacoder")
# standard example
input_text = "def all_odd_elements((L):\n"
# example to do FIM, add fim special tokens: <fim-prefix>, <fim-middle> and <fim-suffix>
input_text_fim = "<fim-prefix>def fib(n):<fim-suffix> else:\n return fib(n - 2) + fib(n - 1)<fim-middle>"
# tokenizer(inputs) returns input_ids, attention_mask and token_type_ids; the latter shouldn't be fed to the model,
# so if you want to use model(**inputs) or model.generate(**inputs), make sure you add return_token_type_ids=False so it isn't returned
inputs = tokenizer(input_text, return_tensors="pt") # add return_token_type_ids=False for model(**inputs)
inputs_fim = tokenizer(input_text_fim, return_tensors="pt") # add return_token_type_ids=False for model(**inputs)
outputs = model.generate(inputs["input_ids"], max_new_tokens=18)
outputs_fim = model.generate(inputs_fim["input_ids"], max_new_tokens=25)
generation = [tokenizer.decode(tensor, skip_special_tokens=False) for tensor in outputs]
generation_fim = [tokenizer.decode(tensor, skip_special_tokens=False) for tensor in outputs_fim]
print(f"Standard example:\n {generation[0]}")
print(f"FIM example:\n {generation_fim[0]}")
Standard example:
def all_odd_elements((L):
return all(x % 2!= 0 for x in L)
FIM example:
<fim-prefix>def fib(n):<fim-suffix> else:
return fib(n - 2) + fib(n - 1)<fim-middle>
if n == 0:
return 0
elif n == 1:
return 1
<|endoftext|><fim-prefix>
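If you only want the infilled code (without the prompt and the FIM special tokens), you can post-process the decoded string. A minimal sketch, assuming the generation keeps the `<fim-middle>` marker and may end with `<|endoftext|>`; `extract_fim_middle` is a hypothetical helper name:

```python
def extract_fim_middle(generated: str) -> str:
    # The decoded FIM output looks like:
    # <fim-prefix>PREFIX<fim-suffix>SUFFIX<fim-middle>MIDDLE<|endoftext|>...
    # so the infilled code is whatever sits between <fim-middle> and <|endoftext|>.
    middle = generated.split("<fim-middle>", 1)[-1]
    middle = middle.split("<|endoftext|>", 1)[0]
    return middle

# Using the FIM output printed above:
text = (
    "<fim-prefix>def fib(n):<fim-suffix> else:\n"
    "    return fib(n - 2) + fib(n - 1)<fim-middle>\n"
    "    if n == 0:\n"
    "        return 0\n"
    "    elif n == 1:\n"
    "        return 1\n"
    "<|endoftext|><fim-prefix>"
)
print(extract_fim_middle(text))
```

This is pure string handling, so it works the same whether you decoded with skip_special_tokens=False (as above) or not, as long as the markers are still present.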
Yeah, there was a mistake in my text generation code ^^ I have changed it and it works well now.
Previously:
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
I checked the README.md again and changed it to
inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=64)
I have just tried to add return_token_type_ids=False in the first case, too, and it also works. Thank you ^^
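For reference, a third option is to drop the key from the encoding before unpacking it into generate(). A sketch under the assumption that the tokenizer's output behaves like a dict (BatchEncoding does); `strip_token_type_ids` is a hypothetical helper name:

```python
def strip_token_type_ids(encoding: dict) -> dict:
    # Copy the encoding and remove token_type_ids if present, so
    # model.generate(**encoding) never receives that kwarg.
    encoding = dict(encoding)
    encoding.pop("token_type_ids", None)
    return encoding

# A dict shaped like a tokenizer's output, for illustration:
enc = {"input_ids": [[48, 1]], "attention_mask": [[1, 1]], "token_type_ids": [[0, 0]]}
print(strip_token_type_ids(enc))  # token_type_ids is no longer in the dict
```

With the real tokenizer this would be `inputs = strip_token_type_ids(tokenizer(input_text, return_tensors="pt"))` followed by `model.generate(**inputs, ...)`.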
Great! You don't even need to specify return_token_type_ids=False now; we turned it off by default.