cognitivecomputations/dolphin-2.9-llama3-8b-256k

Was trying to quantize to 8 bits to reduce VRAM footprint. Got the stuff below.

#3

by BigDeeper - opened May 1, 2024

This comment has been hidden
