nemo.deploy missing file

#1
by AlirezaAR - opened

I've been encountering difficulties installing and using nemo.deploy. I've installed nemo_toolkit through Docker, with pip, and by cloning your GitHub repository, but each time I've gotten a missing-file error when running import nemo.deploy.
Is there any specific library or additional package required or specific version for nemo.deploy that I might be missing?
Thanks,

You need to use this container: https://registry.ngc.nvidia.com/orgs/ea-bignlp/teams/ga-participants/containers/nemofw-inference (image at nvcr.io/ea-bignlp/ga-participants/nemofw-inference:23.10) -- requires prior registration at https://developer.nvidia.com/nemo-framework
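For anyone who hasn't used NGC containers before, a minimal sketch of pulling and starting that image with Docker. This is illustrative, not an official recipe: it assumes you have completed the NGC registration above, have Docker installed, and have the NVIDIA Container Toolkit set up so that --gpus works.

```shell
# Log in to the NGC registry (uses the API key from your NVIDIA account):
docker login nvcr.io

# Pull the inference image referenced above:
docker pull nvcr.io/ea-bignlp/ga-participants/nemofw-inference:23.10

# Start an interactive shell in the container with GPU access:
docker run --gpus all -it --rm \
    nvcr.io/ea-bignlp/ga-participants/nemofw-inference:23.10 bash
```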

Thanks,
For others who want to run this image: setting two environment variables is necessary to run the Nemotron image correctly.

1- LD_LIBRARY_PATH to "/usr/local/tensorrt/targets/x86_64-linux-gnu/lib" :
export LD_LIBRARY_PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:$LD_LIBRARY_PATH"

2- OPAL_PREFIX to "/opt/hpcx/ompi" :
export OPAL_PREFIX="/opt/hpcx/ompi"
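Put together, here is a small shell snippet that sets both variables and then sanity-checks them before you try import nemo.deploy. The echoed "OK" lines are just illustrative markers added for this sketch, not output from the container itself:

```shell
# 1) Prepend the TensorRT library directory so its shared libraries resolve:
export LD_LIBRARY_PATH="/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:${LD_LIBRARY_PATH}"

# 2) Point Open MPI at its install prefix inside the container:
export OPAL_PREFIX="/opt/hpcx/ompi"

# Sanity-check both values before trying "import nemo.deploy":
case ":$LD_LIBRARY_PATH:" in
  *":/usr/local/tensorrt/targets/x86_64-linux-gnu/lib:"*) echo "LD_LIBRARY_PATH OK" ;;
  *) echo "LD_LIBRARY_PATH is missing the TensorRT dir" ;;
esac
[ "$OPAL_PREFIX" = "/opt/hpcx/ompi" ] && echo "OPAL_PREFIX OK"
```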

NVIDIA org (edited Jan 5)

Thanks for sharing @AlirezaAR. Can you please share the value of these two env variables before you set them?

(edit: normally this shouldn't be needed -- would be good to also know what error you run into without setting them)

Of course. As I remember, the OPAL_PREFIX variable was not set.
@odelalleau

NVIDIA org

@AlirezaAR that's weird. I just tested it, and when I start a bash session within the 23.10 inference container I can see OPAL_PREFIX set to /opt/hpcx/ompi. No idea why you would see something different on your side :/
