How to run on Android mobile phones

#17
by Skku-kjh - opened

Could you provide detailed instructions on how to implement the optimization described in the OctopusV2 technical paper, which involves precomputing the state for a fixed prefix to achieve efficient on-device performance? Specifically, how can we achieve the described performance of completing a function call within 1.1 to 1.7 seconds for typical queries of 20 to 30 tokens using a standard Android phone?
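
To make the question concrete, here is a minimal sketch of what "precomputing the state for a fixed prefix" is taken to mean. It is a toy in plain Python, not the OctopusV2 implementation: `step`, `run`, `FIXED_PREFIX`, `PREFIX_STATE`, `answer_full` and `answer_cached` are all hypothetical names, and the small integer state only stands in for a real model's cached attention keys and values. The idea is that the prefix is processed once, and each query resumes from the saved state, so only the 20 to 30 query tokens are processed per call.

```python
from typing import List, Tuple

# Hypothetical stand-in for a model's cached state (e.g. a KV cache):
# (running hash of everything seen so far, number of tokens seen).
State = Tuple[int, int]


def step(state: State, token: int) -> State:
    """Consume one token and return the updated state (toy model)."""
    h, n = state
    return ((h * 31 + token) % 1_000_003, n + 1)


def run(tokens: List[int], state: State = (0, 0)) -> State:
    """Consume a token sequence, starting from `state`."""
    for t in tokens:
        state = step(state, t)
    return state


# Assumption: the fixed prefix (system prompt, function descriptions)
# is identical for every query, so its state can be computed once.
FIXED_PREFIX: List[int] = list(range(200))
PREFIX_STATE: State = run(FIXED_PREFIX)


def answer_full(query: List[int]) -> State:
    """Baseline: reprocess prefix + query on every call."""
    return run(FIXED_PREFIX + query)


def answer_cached(query: List[int]) -> State:
    """Optimized: resume from the precomputed prefix state."""
    return run(query, PREFIX_STATE)
```

Both paths end in the same state, but `answer_cached` performs `len(query)` steps instead of `len(FIXED_PREFIX) + len(query)`. What is unclear is how to do the equivalent with the actual model and runtime on an Android device.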
