---
language:
  - en
pipeline_tag: text-generation
tags:
  - facebook
  - meta
  - llama
  - llama-3
  - coreml
  - text-generation
license: llama3
---

# Meta Llama 3 8B – Core ML

This repository contains a Core ML conversion of [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B).

This conversion does not include a KV cache. The model takes int32 inputs and produces float16 outputs.
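
If you try the model, a single forward pass should look roughly like the sketch below. The input/output names, the fixed sequence length, and the `.mlpackage` filename are assumptions rather than verified details of this repo, so check them against the actual spec (e.g., `mlmodel.get_spec()`).

```python
# Minimal sketch of running the converted model with coremltools.
# Assumed (not verified for this repo): input named "input_ids", a single
# logits output, a fixed sequence length, and the .mlpackage filename.
import numpy as np
import coremltools as ct
from transformers import AutoTokenizer

mlmodel = ct.models.MLModel("Meta-Llama-3-8B.mlpackage")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

SEQ_LEN = 128  # assumed fixed sequence length baked into the converted graph

# Tokenize a prompt and pad it up to the fixed input shape.
ids = tokenizer("The capital of France is", return_tensors="np")["input_ids"]
input_ids = np.zeros((1, SEQ_LEN), dtype=np.int32)
input_ids[0, : ids.shape[1]] = ids[0]

# Run one forward pass; the outputs dict holds the float16 logits.
outputs = mlmodel.predict({"input_ids": input_ids})
logits = next(iter(outputs.values())).astype(np.float32)

# Without a KV cache the whole sequence is re-run on every decoding step;
# here we only read the next-token prediction at the last real position.
next_id = int(np.argmax(logits[0, ids.shape[1] - 1]))
print(tokenizer.decode([next_id]))
```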

I haven't been able to test this, so please leave a note in the Community tab letting me know how you tested it and how it worked.

I ran `model.half()` before scripting / converting, thinking it would reduce memory usage (I've since read that it doesn't). I'm not sure whether it affected the conversion process.
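
For reference, here is a rough sketch of the conversion path described above. This is not the exact script used for this repo: the sequence length, the wrapper module, and the tensor names are assumptions, and tracing a half-precision model on CPU may require a recent PyTorch (or tracing in float32 instead).

```python
# Hypothetical reconstruction of the described conversion: model.half(),
# then tracing, then coremltools conversion with int32 inputs and
# float16 compute precision. Names and shapes below are assumptions.
import numpy as np
import torch
import coremltools as ct
from transformers import AutoModelForCausalLM

SEQ_LEN = 128  # assumed fixed context length for the traced graph

model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = model.half().eval()  # the card's model.half() step


class LogitsOnly(torch.nn.Module):
    """Wrap the HF model so tracing sees a single tensor output (the logits)."""

    def __init__(self, lm):
        super().__init__()
        self.lm = lm

    def forward(self, input_ids):
        # use_cache=False matches the card: no KV cache in the converted model.
        return self.lm(input_ids=input_ids, use_cache=False).logits


example_ids = torch.zeros((1, SEQ_LEN), dtype=torch.long)
traced = torch.jit.trace(LogitsOnly(model), example_ids)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="input_ids", shape=(1, SEQ_LEN), dtype=np.int32)],
    compute_precision=ct.precision.FLOAT16,  # float16 outputs, as stated above
    minimum_deployment_target=ct.target.iOS17,
)
mlmodel.save("Meta-Llama-3-8B.mlpackage")
```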