Tensor parallel version of this for efficient inference?

#5
by mayank-mishra - opened

Does a tensor-parallel (TP) version exist?

mayank-mishra changed discussion status to closed
