ezrkllm-collection

A collection of LLMs converted with Rockchip's rkllm-toolkit to run on Rockchip chips. This repo contains the converted models for the RK3588 NPU found in SBCs such as the Orange Pi 5, NanoPi R6 and Radxa Rock 5.

See the main repo on GitHub for installation and usage instructions: https://github.com/Pelochus/ezrknpu

Available LLMs

Before running any LLM, keep in mind that the required RAM is roughly 1.5 to 3 times the model size (this is an estimate; I haven't done extensive testing yet).
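As a rough illustration of that estimate, the shell function below prints the expected RAM range for a downloaded model file. It is only a sketch: the name estimate_ram_mib is made up for this example, and the 1.5x and 3x factors are simply the rule of thumb above, not measured values.

```shell
# Hypothetical helper (not part of ezrknpu or rkllm-toolkit):
# print the estimated RAM range in MiB for a model file,
# using the 1.5x-3x rule of thumb from this README.
estimate_ram_mib() {
    # File size in bytes; fail if the file cannot be read.
    size_bytes=$(wc -c < "$1") || return 1
    size_mib=$((size_bytes / 1024 / 1024))
    low=$((size_mib * 3 / 2))   # 1.5x the model size (integer division)
    high=$((size_mib * 3))      # 3x the model size
    echo "model: ${size_mib} MiB, estimated RAM: ${low}-${high} MiB"
}
```

Call it with the path to a model file inside the cloned folder and compare the result against the output of 'free -m' on the board.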

Right now, only the following models have been converted:

Llama 2 was converted using Azure servers. For reference, converting Phi-2 peaked at about 15 GB of RAM plus 25 GB of swap (including the OS, which used about 2 GB at most). Converting Llama 2 7B peaked at about 32 GB of RAM plus 50 GB of swap.
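If you want to try a conversion yourself, the figures above add up to roughly 40 GB of RAM plus swap for Phi-2 and 82 GB for Llama 2 7B. The sketch below checks whether a Linux machine has that much in total. The function name has_ram_plus_swap_gib is hypothetical, it assumes a /proc/meminfo-style file, and it treats the GB figures above as GiB, so take the result as a rough guide only.

```shell
# Hypothetical helper: succeed if RAM + swap reaches the given number of GiB.
# Reads MemTotal and SwapTotal (reported in KiB) from a meminfo-style file,
# defaulting to /proc/meminfo on Linux.
has_ram_plus_swap_gib() {
    need_gib=$1
    meminfo=${2:-/proc/meminfo}
    total_kib=$(awk '/^(MemTotal|SwapTotal):/ { sum += $2 } END { printf "%d\n", sum }' "$meminfo")
    [ "$total_kib" -ge $((need_gib * 1024 * 1024)) ]
}
```

For example, 'has_ram_plus_swap_gib 40 && echo "probably enough for Phi-2"' uses the Phi-2 peak from this README as the threshold.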

Downloading a model

Use:

git clone LINK_FROM_PREVIOUS_TABLE_HERE

And then (may not be necessary):

git lfs pull

If the first clone gives you problems (for example, it takes too long), you can instead run:

GIT_LFS_SKIP_SMUDGE=1 git clone LINK_FROM_PREVIOUS_TABLE_HERE

Then run 'git lfs pull' inside the cloned folder to download the full model.

RKLLM parameters used

The RK3588 only supports w8a8 quantization, so that is the quantization used for all models. Beyond that, the RKLLM toolkit offers two optimization levels: no optimization (0) and optimization (1). All models were converted with optimization enabled (1).

Future additions

  • Converting other compatible LLMs
  • Adding other compatible Rockchip SoCs

More info
