Faster inference

#1
by soujanyaporia - opened

This is very good work! Thanks so much.

  1. Could we have an option to specify how many samples to generate, instead of always generating a fixed number of samples per prompt?
  2. Could we make inference faster by using a different scheduler that needs fewer than 100 steps, and by using FlashAttention?
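For reference, here is a minimal sketch of what I have in mind, assuming the model is exposed as a diffusers-style pipeline. The model id and the per-prompt sample kwarg below are placeholders, not this project's actual API:

```python
# Hedged sketch, assuming a diffusers-style pipeline; "model-id" and
# num_waveforms_per_prompt are placeholders, not this project's actual API.
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("model-id", torch_dtype=torch.float16)

# Swap the default scheduler for DPM-Solver++ (multistep), which typically
# reaches good quality in ~20-25 steps instead of 100.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# On PyTorch 2.x, scaled_dot_product_attention can dispatch to FlashAttention
# kernels automatically on supported GPUs, so no extra code may be needed.
pipe = pipe.to("cuda")

out = pipe(
    "a dog barking in the distance",
    num_inference_steps=25,
    num_waveforms_per_prompt=4,  # placeholder kwarg; depends on the pipeline
)
```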

Thank you!
