
Run phi3-mini on an AMD NPU

  1. If `phi3_mini_awq_4bit_no_flash_attention.pt` is missing, run AWQ quantization to produce the quantized checkpoint (a sketch follows this list).
  2. Copy the `modeling_phi3.py` from this repo into the phi-3-mini model folder.
  3. Modify the file path in `run_awq.py` to point to that folder.
  4. Run `python run_awq.py --task decode --target aie --w_bit 4`.

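The quantization in step 1 can be done with mit-han-lab/llm-awq (phi-3 support is added in the PR linked below). The following is a minimal sketch, not the exact script used for this repo: the function names follow the llm-awq codebase, while the model path, group size, and output filename are assumptions to adapt. Note that llm-awq's scale search expects a CUDA GPU in the upstream repo.

```python
# Sketch of step 1 with mit-han-lab/llm-awq; phi-3 support comes from PR #183.
# model_path, q_group_size, and the output filename are assumptions -- adjust them.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from awq.quantize.pre_quant import run_awq, apply_awq
from awq.quantize.quantizer import pseudo_quantize_model_weight

model_path = "microsoft/Phi-3-mini-4k-instruct"  # assumed; point at your phi-3-mini folder
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    attn_implementation="eager",  # keep flash attention off, matching the checkpoint name
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)

q_config = {"zero_point": True, "q_group_size": 128}  # common AWQ defaults

# Search for activation-aware scales on calibration data, fold them into the
# weights, then fake-quantize the weights to 4 bits.
awq_results = run_awq(model, tokenizer, w_bit=4, q_config=q_config)
apply_awq(model, awq_results)
pseudo_quantize_model_weight(model, w_bit=4, q_config=q_config)

torch.save(model, "phi3_mini_awq_4bit_no_flash_attention.pt")  # full-model checkpoint
```
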
Reference: https://github.com/amd/RyzenAI-SW/tree/main/example/transformers

For the AWQ quantization of phi-3, refer to https://github.com/mit-han-lab/llm-awq/pull/183

PS: Decode performance on the NPU is similar to that on the CPU (Ryzen 5 7640HS).
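
To reproduce the CPU side of that comparison, here is a rough sketch; the checkpoint filename and tokenizer path are assumptions, and the NPU run goes through `run_awq.py --target aie` instead.

```python
# Rough CPU tokens/sec check for the saved checkpoint (paths are assumptions).
import time
import torch
from transformers import AutoTokenizer

# The checkpoint pickles the whole model, so modeling_phi3.py must be importable.
model = torch.load(
    "phi3_mini_awq_4bit_no_flash_attention.pt",
    map_location="cpu",
    weights_only=False,  # needed on newer PyTorch for full-model pickles
)
model.eval()

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")  # assumed

inputs = tokenizer("What does an NPU do?", return_tensors="pt")
start = time.perf_counter()
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=64)
elapsed = time.perf_counter() - start

new_tokens = out.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.2f} tokens/s")
```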
