execute in the video demo

#13

by AcceleratedNpc - opened Apr 12

Apr 12

Glad to see such amazing work!
I'd like to know whether the execution script shown in the video demo will be released.
More detailed code would also be appreciated.
Will you release more example code for use on Android devices? If so, when?
Thanks again for your wonderful work.

AcceleratedNpc

Apr 12

I tried the example provided by the team and got a result with a latency of 20.7 s.
I'd like to know the latency measured on your device, so that the accelerated model inference mentioned in the paper can be experienced directly.
Looking forward to your reply.

alexchen4ai

Nexa AI org Apr 13

Language models have many available optimizations, such as KV cache, quantization, model pruning, specific memory access patterns, etc. Try using these tricks.

Or you can join our waitlist: https://www.nexa4ai.com/contact, and we will provide solutions for the above.
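
For readers unfamiliar with the first trick in that list, here is a toy sketch of what a KV cache saves during decoding. It is not the model's actual implementation: the single-head attention, the `ProjectionCounter` class, and the scalar "projections" are all hypothetical stand-ins chosen so the example runs with the standard library only. It compares recomputing keys and values for the whole prefix at every step against appending only the newest ones.

```python
import math
from typing import List

Vector = List[float]


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Single-head scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) * scale for key in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


class ProjectionCounter:
    """Hypothetical key/value projections that count how often they run."""

    def __init__(self) -> None:
        self.calls = 0

    def key(self, x: Vector) -> Vector:
        self.calls += 1
        return [2.0 * v for v in x]

    def value(self, x: Vector) -> Vector:
        self.calls += 1
        return [0.5 * v for v in x]


def decode_no_cache(tokens: List[Vector], proj: ProjectionCounter) -> List[Vector]:
    """Recompute keys and values for the whole prefix at every step."""
    outputs = []
    for t in range(1, len(tokens) + 1):
        keys = [proj.key(x) for x in tokens[:t]]
        values = [proj.value(x) for x in tokens[:t]]
        outputs.append(attend(tokens[t - 1], keys, values))
    return outputs


def decode_with_cache(tokens: List[Vector], proj: ProjectionCounter) -> List[Vector]:
    """Project each token once and keep the results in a growing cache."""
    keys: List[Vector] = []
    values: List[Vector] = []
    outputs = []
    for x in tokens:
        keys.append(proj.key(x))
        values.append(proj.value(x))
        outputs.append(attend(x, keys, values))
    return outputs


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
slow, fast = ProjectionCounter(), ProjectionCounter()
out_slow = decode_no_cache(tokens, slow)
out_fast = decode_with_cache(tokens, fast)
print(slow.calls, fast.calls)  # 20 projections without the cache, 8 with it
```

Both loops produce the same attention outputs; the cached version simply avoids projection work that grows quadratically with sequence length.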

zackli4ai

Nexa AI org May 6

edited May 6

@AcceleratedNpc Which GPU are you using? Also, please implement early stopping criteria. Once you observe , the inference can stop
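
A minimal sketch of the early stopping idea, assuming a greedy token-by-token decode loop. The stop token is not named in this thread, so `STOP` below is a placeholder, and `generate_with_early_stop` and `make_scripted_model` are hypothetical helpers rather than part of any released code.

```python
from typing import Callable, List


def generate_with_early_stop(
    next_token: Callable[[List[str]], str],
    prompt_tokens: List[str],
    stop_token: str,
    max_new_tokens: int = 256,
) -> List[str]:
    """Decode one token at a time and halt as soon as stop_token appears."""
    context = list(prompt_tokens)
    generated: List[str] = []
    for _ in range(max_new_tokens):
        token = next_token(context)
        if token == stop_token:
            break  # nothing after the stop token is needed, so skip the work
        generated.append(token)
        context.append(token)
    return generated


def make_scripted_model(script: List[str]) -> Callable[[List[str]], str]:
    """Hypothetical stand-in for a real model: replays a fixed token script."""
    remaining = iter(script)
    return lambda context: next(remaining)


STOP = "<stop>"  # placeholder; substitute the model's real end marker
model = make_scripted_model(["call", "(", ")", STOP, "junk", "junk"])
output = generate_with_early_stop(model, ["query"], STOP)
print(output)  # ['call', '(', ')']
```

Without the check, the loop would keep running until `max_new_tokens` is reached, which is one common reason measured latency ends up far higher than expected.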
