Time dependence on output length is not linear ... that's strange

by BoccheseGiacomo - opened

I tried the code with several output lengths (10, 20, 40, and 80 tokens), and the generation time does not scale linearly:
2, 5, 14, and 40 seconds respectively.

This is strange, since a key characteristic of Mamba is that its time and memory complexity are linear in the sequence length.
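To put a number on how far from linear this is, here is a rough sketch that fits a power law to the four timings above. It only uses the figures from this post. The helper name `fit_exponent` is made up for illustration, and four measurements are far too few for a reliable fit, so treat the result as an indication only.

```python
import math

# Timings reported above: output length in tokens and wall-clock seconds.
tokens = [10, 20, 40, 80]
seconds = [2, 5, 14, 40]

def fit_exponent(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. k in y ~ c * x**k.

    Hypothetical helper for this post, not part of any library.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Growth factor in time for each doubling of output length (2.0 would be linear).
ratios = [b / a for a, b in zip(seconds, seconds[1:])]
exponent = fit_exponent(tokens, seconds)

print("time ratio per doubling:", [round(r, 2) for r in ratios])
print("fitted exponent:", round(exponent, 2))
```

Each doubling of the output length multiplies the time by roughly 2.5 to 2.9, where linear scaling would give 2, and the fitted exponent comes out between 1 and 2. So the measured behaviour is worse than linear but not quite quadratic.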
