Unable to use on free tier Google Colab

#2
by sudhir2016 - opened

I tried running this model on free-tier Colab with int8 quantization using Quanto. The model loads but runs out of RAM at inference. I then tried int4 quantization. Inference started but just kept running without producing any output; I waited 20 minutes and then gave up.
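For context, here is a rough sketch of the weight-only memory arithmetic behind int8 versus int4. The parameter count is a placeholder, not this model's actual size, and `weight_memory_gib` is a hypothetical helper. The estimate covers weights only and ignores activations, the KV cache, and any overhead while loading, so real usage will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    bytes_total = num_params * bits_per_weight / 8
    return bytes_total / 1024**3


# Placeholder parameter count (assumption): substitute the real value
# from the model card before drawing any conclusions.
EXAMPLE_PARAMS = 8 * 1024**3

for bits in (16, 8, 4):
    gib = weight_memory_gib(EXAMPLE_PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{gib:.1f} GiB")
```

If the int8 figure is already close to the RAM Colab reports as available, running out of memory at inference is not surprising, since generation needs extra memory on top of the weights.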
