Problems with temperature when using the model in Python code

#6
by matchaslime

Hi, I am following the instructions for using the model in Python code, but it always outputs the same response to the same prompt. Changing the temperature does not seem to do anything. What could be the issue here?

Could you provide some code?

I'm just using the example code in the README, from the AutoGPTQ section.
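
For context, the kind of code in question is roughly of this shape (a sketch of the usual AutoGPTQ load-and-generate flow, not the exact README code; the repo id is a placeholder):

    from transformers import AutoTokenizer
    from auto_gptq import AutoGPTQForCausalLM

    # Placeholder repo id -- substitute the actual GPTQ model
    model_name_or_path = "some-user/some-model-GPTQ"

    tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
    model = AutoGPTQForCausalLM.from_quantized(
        model_name_or_path,
        device="cuda:0",
        use_safetensors=True,
    )

    prompt = "Tell me about AI"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
    # Changing this temperature value has no visible effect on the output
    output = model.generate(inputs=input_ids, temperature=0.7, max_new_tokens=512)
    print(tokenizer.decode(output[0]))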

When I've tested inference before, I've used code to change the seed, like so:

    import random
    import torch

    # ...inside my generation wrapper class:

    @property
    def seed(self):
        return self._current_seed

    @seed.setter
    def seed(self, seed):
        self._seed = int(seed)

    def update_seed(self):
        # A seed of -1 means: pick a fresh random seed for this generation.
        self._current_seed = random.randint(1, 2**31) if self._seed == -1 else self._seed
        # Seed Python's RNG, torch's CPU RNG, and all CUDA devices.
        random.seed(self._current_seed)
        torch.manual_seed(self._current_seed)
        torch.cuda.manual_seed_all(self._current_seed)

    def generate(self, prompt):
        self.update_seed()
        input_ids, len_input_ids = self.encode(prompt)

        with self.do_timing(True) as timing:
            with torch.no_grad():
                tokens = self.model.generate(inputs=input_ids, generation_config=self.generation_config)[0].cuda()
            len_reply = len(tokens) - len_input_ids
            response = self.tokenizer.decode(tokens)
            reply_tokens = tokens[-len_reply:]
            reply = self.tokenizer.decode(reply_tokens)

        result = {
            'response': response,   # The response in full, including prompt
            'reply': reply,         # Just the reply, no prompt
            'len_reply': len_reply, # The length of the reply tokens
            'seed': self.seed,      # The seed used to generate this response
            'time': timing['time']  # The time in seconds to generate the response
        }
        return result

You could try the same approach to get a different seed for each generation.
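
One other thing worth checking: in transformers, temperature only has an effect when sampling is enabled, so the generation config needs do_sample=True; with greedy decoding, neither temperature nor the seed will change the output. Here is a minimal sketch of per-generation seeding, assuming `model` and `tokenizer` are already loaded as in the README example:

    import random
    import torch
    from transformers import GenerationConfig

    generation_config = GenerationConfig(
        do_sample=True,      # without this, temperature is ignored
        temperature=0.7,
        max_new_tokens=128,
    )

    prompt = "Tell me about AI"
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()

    for _ in range(3):
        seed = random.randint(1, 2**31)  # fresh seed per generation
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        output = model.generate(inputs=input_ids, generation_config=generation_config)
        print(seed, tokenizer.decode(output[0]))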
