The memory has not been released after the first inference.
My guess is that it's the model weights, which stay loaded on the GPU.
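
A rough way to sanity-check that guess is to compare the memory still held after inference with the expected size of the weights (parameter count times bytes per parameter). This is only a sketch: the helper names, the 7B parameter count, and the 20% tolerance are made up for illustration. If the framework is PyTorch (an assumption, it isn't stated here), the retained figure could come from `torch.cuda.memory_allocated()`.

```python
# Standard byte sizes for common weight dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_weight_bytes(num_params: int, dtype: str = "float16") -> int:
    """Expected memory footprint of the weights alone, in bytes."""
    return num_params * BYTES_PER_PARAM[dtype]


def looks_like_weights(retained_bytes: int, num_params: int,
                       dtype: str = "float16", tolerance: float = 0.2) -> bool:
    """True if the retained memory is within `tolerance` of the weight size.

    `tolerance` is a hypothetical fudge factor for buffers and allocator
    overhead; adjust it for your setup.
    """
    expected = estimate_weight_bytes(num_params, dtype)
    return abs(retained_bytes - expected) <= tolerance * expected


if __name__ == "__main__":
    # Illustrative numbers only: a 7B-parameter model in float16,
    # with 15 GB still held after the first inference.
    print(estimate_weight_bytes(7_000_000_000, "float16"))
    print(looks_like_weights(15_000_000_000, 7_000_000_000, "float16"))
```

If the retained memory is close to the estimate, it is most likely just the weights staying resident, which is expected as long as the model object is still alive.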