Meta Llama 3 8B – Core ML
This repository contains a Core ML conversion of meta-llama/Meta-Llama-3-8B.
This conversion does not include a KV cache; the model takes int32 token inputs and produces float16 outputs.
I haven't been able to test this myself, so please leave a note in the Community tab describing how you tested it and how it worked.
I called model.half() before scripting/converting, thinking it would reduce memory usage during conversion (I later read that it doesn't). I am unsure whether this affected the conversion process.
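For what it's worth, the half() step itself behaves predictably in isolation: it casts the parameters to float16 and halves their in-memory size, and scripting preserves that dtype (whatever peak memory during the full Core ML conversion does). A minimal sketch with a toy module (TinyNet and its shapes are made up for illustration, not the actual Llama conversion):

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in for the real model; shapes are arbitrary."""
    def __init__(self):
        super().__init__()
        self.proj = nn.Linear(16, 16)

    def forward(self, x):
        return self.proj(x)

model = TinyNet().eval()
fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

model = model.half()  # cast weights to float16 before scripting, as described above
fp16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

scripted = torch.jit.script(model)  # scripting preserves the float16 weights

print(next(scripted.parameters()).dtype)  # torch.float16
print(fp16_bytes * 2 == fp32_bytes)       # True: parameter storage is halved
```

In a real conversion, the scripted module would then be passed to coremltools' ct.convert; whether casting before or after conversion changes the result is exactly the open question above.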