Converted from TinyLlama/TinyLlama-1.1B-Chat-v1.0 and quantized to 4 bits.
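
For readers unfamiliar with 4-bit quantization, here is a minimal sketch of the general idea. This card does not say which quantization scheme was used for this model, so the symmetric single-scale scheme below, and the names `quantize_4bit` and `dequantize_4bit`, are illustrative assumptions rather than the actual conversion procedure.

```python
from typing import List, Tuple


def quantize_4bit(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to signed 4-bit codes in [-8, 7] with one shared scale.

    Hypothetical helper for illustration; not the method used for this model.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    # Choose the scale so the largest magnitude maps to code 7.
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_4bit(codes: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the 4-bit codes."""
    return [c * scale for c in codes]


# A made-up block of weights, not taken from the model.
block = [0.12, -0.70, 0.33, 0.05]
codes, scale = quantize_4bit(block)
restored = dequantize_4bit(codes, scale)
print(codes, scale, restored)
```

Each weight is stored as a 4-bit integer plus a shared floating-point scale, so the restored values differ from the originals by at most about half a scale step. Practical schemes typically apply this per small block of weights to keep that error low.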

Requires onnxruntime>=0.17.0.
