Successfully run on GPU with DirectML

#4
by Pengan - opened

First install the DirectML build of ONNX Runtime:
`pip install onnxruntime-directml`
Then, in the `model.py` file, change `providers = ["CPUExecutionProvider"]` to `providers = ["DmlExecutionProvider"]`.
It will then run on the GPU; verified on an RTX 4060 (8 GB), with 7.7 GB of VRAM consumed.
Also tested on the HD 620 iGPU of an Intel i5-7200U: it runs slowly at 5-6 s per token, but generates correct content with no problems.
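For reference, the change can also be written defensively so the same `model.py` still works on machines without DirectML. This is just a sketch of the selection logic, not the repo's actual code; the `pick_providers` helper is my own naming:

```python
def pick_providers(available):
    # Prefer the DirectML provider when onnxruntime-directml is installed;
    # otherwise fall back to the CPU provider.
    if "DmlExecutionProvider" in available:
        return ["DmlExecutionProvider"]
    return ["CPUExecutionProvider"]


# Usage sketch (requires onnxruntime / onnxruntime-directml installed):
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   session = ort.InferenceSession("model.onnx", providers=providers)
```

ONNX Runtime tries providers in list order, so putting `DmlExecutionProvider` first is enough when both are listed.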
