Update the instructions to make them consistent with the demo page. Also, we only need A10G-Small now that the preload is removed and the CPU RAM requirement is the same as for T4 (15 GB).
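
For anyone duplicating the Space and following these instructions, here is a minimal sketch of switching the duplicate to the recommended tier with the `huggingface_hub` client. This is only an illustration (the repo id is a placeholder), and the same change can also be made from the Space settings UI.

```python
# Illustrative only: request the A10G-Small tier for a duplicated Space
# via huggingface_hub. The repo id below is a placeholder.
from huggingface_hub import HfApi

api = HfApi()  # assumes you are already logged in (e.g. `huggingface-cli login`)
api.request_space_hardware(
    repo_id="your-username/your-duplicate-of-this-space",  # placeholder
    hardware="a10g-small",
)
```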

btw, I just tried one more thing: 8-bit seems to be slower than both 4-bit and 16-bit on A10G. Not sure whether we should recommend 4-bit or 16-bit for A10G now.
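
To make the comparison concrete, here is a rough sketch of what the three modes correspond to, assuming the model is loaded with transformers + bitsandbytes; the model id and helper are placeholders, not the Space's actual loading code.

```python
# Hypothetical sketch: the three precision modes being compared.
# Load only one variant at a time when benchmarking on a single GPU.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

MODEL_ID = "some-org/some-model"  # placeholder


def load_model(precision: str):
    """Load the model in 16-bit, 8-bit (LLM.int8()), or 4-bit (NF4)."""
    if precision == "16bit":
        # plain fp16 weights: highest VRAM use, no quantization overhead
        return AutoModelForCausalLM.from_pretrained(
            MODEL_ID, torch_dtype=torch.float16, device_map="auto"
        )
    if precision == "8bit":
        # bitsandbytes 8-bit: the mode that appears slower on A10G here
        quant = BitsAndBytesConfig(load_in_8bit=True)
    elif precision == "4bit":
        # bitsandbytes NF4 with fp16 compute: the mode that fits a T4-Small
        quant = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_compute_dtype=torch.float16,
        )
    else:
        raise ValueError(f"unknown precision: {precision}")
    return AutoModelForCausalLM.from_pretrained(
        MODEL_ID, quantization_config=quant, device_map="auto"
    )


# Example: benchmark one mode at a time, e.g. model = load_model("4bit")
```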

nit: a small typo: "A10G-Medium (24G)". There is no Medium variant :)

you mean for the 8-bit inference, right? 4-bit seems to be working well with a T4-Small

maybe we should add this to the description section (above or under the recommended configurations) until we figure out what the issue could be, wdyt?

yep, I mean that 15 GB of CPU RAM is enough, so A10G-Large is not needed; Small is enough.

Totally agree, updated the description; please feel free to change or add comments if you find it necessary. Thanks.

Also, what's your Twitter account? It would be great to tag you for the HF Space when we tweet about it later :)

It looks good to me, thanks for the improvements!
Merging it if there are no more changes.

My Twitter account is @dotglub :)

Please proceed, thanks!

badayvedat changed pull request status to merged
