I'm running Dolly on Colab with 12 GB of RAM and it quickly ran out of memory. How much RAM does Dolly need? Can I run it on Google Colab, or should I deploy it to AWS SageMaker?
To load the model? It's 12B parameters, so it will need much more than 12 GB of memory; I'm using a 64 GB instance and it's fine. You need a relatively powerful GPU to run this at reasonable speed anyway, so you probably do want a proper cloud GPU instance. I use a g5.4xlarge on AWS in Databricks and it works well; you just need to enable load_in_8bit to fit the model in GPU memory.
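For a rough sense of why 12 GB isn't enough, here's the back-of-the-envelope arithmetic for the weights alone (ignoring activations and optimizer state; the 24 GB figure for the g5.4xlarge's A10G GPU is from AWS specs):

```python
# Approximate memory needed just to hold Dolly's 12B weights.
params = 12e9

fp16_gb = params * 2 / 1e9   # 2 bytes per param in fp16  -> ~24 GB
int8_gb = params * 1 / 1e9   # 1 byte per param with load_in_8bit -> ~12 GB

print(f"fp16 weights: ~{fp16_gb:.0f} GB")  # ~24 GB
print(f"int8 weights: ~{int8_gb:.0f} GB")  # ~12 GB

# An A10G (the GPU in a g5.4xlarge) has 24 GB of memory, so the
# 8-bit weights fit with headroom for activations; fp16 does not.
```

So a 12 GB Colab instance can't even hold the fp16 weights, while 8-bit quantization halves the footprint to roughly 12 GB, which is why load_in_8bit makes it fit on a single 24 GB GPU.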