Time dependence on output length is not linear ... that's strange
#3, opened by BoccheseGiacomo
I was trying the code with various output lengths. I tried 10, 20, 40, and 80 tokens, and the generation times were 2, 5, 14, and 40 seconds respectively.
The dependence is not linear: each doubling of the output length multiplies the time by roughly 2.5 to 2.9, where linear scaling would give a factor of 2.
This is strange, since a key characteristic of Mamba is time and memory complexity that is linear in the number of tokens.
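
To put a number on how non-linear this is, here is a minimal sketch that fits a power law, time ≈ c · n^k, to the four measurements above using a least-squares fit in log-log space. The helper name `loglog_slope` is my own, and the power-law model is only an assumption: with four rough timings that may include fixed startup overhead, the exponent is a ballpark figure, not a precise measurement.

```python
import math


def loglog_slope(lengths, seconds):
    """Least-squares slope of log(seconds) against log(lengths).

    A slope near 1 suggests linear scaling, near 2 suggests quadratic.
    """
    xs = [math.log(n) for n in lengths]
    ys = [math.log(t) for t in seconds]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Measurements reported in this post.
lengths = [10, 20, 40, 80]   # output tokens
seconds = [2, 5, 14, 40]     # wall-clock generation time

exponent = loglog_slope(lengths, seconds)

# Growth factor per doubling of the output length (2.0 would be linear).
ratios = [later / earlier for earlier, later in zip(seconds, seconds[1:])]

print(f"estimated exponent k: {exponent:.2f}")
print("time ratio per doubling:", [round(r, 2) for r in ratios])
```

On these numbers the fitted exponent lands between 1 and 2, so the scaling is clearly superlinear but not fully quadratic.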