Could I bypass language interpretation by setting up a tokenized prompt directly?

#719
by Odair - opened

I wanted to explore what the Dalle-mini lexicon is, but my naive attempt to modify tokenized_prompts failed:

prompts = ["cat"]
tokenized_prompts = processor(prompts)
print(tokenized_prompts['input_ids'][0])

#Instead of x[idx] = y, use x = x.at[idx].set(y) or another .at[] method
tokenized_prompts['input_ids'][0].at[1].set(1193) #show a dog instead

print(tokenized_prompts['input_ids'][0][1])

result:

[ 0 803 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1]
803 #Why didn't it change?
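
A likely explanation, sketched under the assumption that the processor returns JAX arrays (the .at[] hint quoted above suggests it does): JAX arrays are immutable, so .at[idx].set(y) returns a new array and leaves the original untouched. In the snippet above the returned array is discarded. The short array below is a stand-in for tokenized_prompts['input_ids'][0], not real processor output:

```python
import jax.numpy as jnp

# Stand-in for tokenized_prompts['input_ids'][0]; the first values are
# copied from the printed result above, truncated for brevity.
row = jnp.array([0, 803, 2, 1, 1])

# .at[...].set(...) does not modify `row`; it returns an updated copy.
updated = row.at[1].set(1193)

print(int(row[1]))      # 803  (original is unchanged)
print(int(updated[1]))  # 1193 (only the returned array has the new token)
```

Keeping the returned array and putting it back where the model reads it from is what makes the change stick.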

Oh, I've made some progress:

prompts = ["cat"]
tokenized_prompts = processor(prompts)
a = tokenized_prompts['input_ids'][0].at[1].set(1193) #show a dog instead
b = tokenized_prompts['input_ids'].at[0].set(a)
tokenized_prompts['input_ids'] = b
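
The two-step update above can probably be collapsed into one by indexing the batch and token positions together. This is only a sketch: the dictionary below is a hypothetical stand-in for the processor output (a single prompt padded with 1s), and 1193 is simply the token id used above.

```python
import jax.numpy as jnp

# Hypothetical stand-in for processor(prompts): one prompt, shape (1, 6).
tokenized_prompts = {"input_ids": jnp.array([[0, 803, 2, 1, 1, 1]])}

# Replace token 1 of prompt 0 in a single functional update,
# then rebind the dictionary entry to the new array.
tokenized_prompts["input_ids"] = (
    tokenized_prompts["input_ids"].at[0, 1].set(1193)
)

print(tokenized_prompts["input_ids"][0])
```

The same pattern should work for any position, for example .at[0, 2] for the next token, as long as the result is assigned back.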
