BigMaid-20B-v2.0

ExLlamaV2 quant (8.0 bpw, h8) of TeeZee/BigMaid-20B-v2.0, made using the specialized dataset openerotica/erotiquant.

Runs smoothly on a single 3090 in webui with the context length set to 4096, the ExLlamav2_HF loader, and cache_8bit=True.
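To see roughly why these settings fit in 24 GB, here is a back-of-the-envelope estimate. It is only a sketch: the parameter count is the nominal "20B", the 8.0 bits per weight is taken from the repository name, and the layer count (62) and hidden size (5120) are assumptions about the architecture, not values from this card. Activations and loader overhead are ignored.

```python
# Rough VRAM estimate for the weights plus the KV cache.
# All architecture numbers below are assumptions, not measured values.
GIB = 1024 ** 3


def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers: int, hidden_size: int,
                   context_len: int, bytes_per_element: int) -> int:
    """Keys and values: two tensors per layer, each context_len x hidden_size."""
    return 2 * n_layers * hidden_size * context_len * bytes_per_element


N_PARAMS = 20e9   # nominal "20B" (assumption)
BPW = 8.0         # from "bpw8.0" in the repository name
N_LAYERS = 62     # assumed layer count
HIDDEN = 5120     # assumed hidden size
CONTEXT = 4096    # context length used in webui

weights = weight_bytes(N_PARAMS, BPW)
cache_fp16 = kv_cache_bytes(N_LAYERS, HIDDEN, CONTEXT, 2)  # 16-bit cache
cache_8bit = kv_cache_bytes(N_LAYERS, HIDDEN, CONTEXT, 1)  # cache_8bit=True

for label, cache in (("16-bit cache", cache_fp16), ("8-bit cache", cache_8bit)):
    total = (weights + cache) / GIB
    print(f"{label}: cache {cache / GIB:.2f} GiB, total {total:.2f} GiB")
```

Under these assumptions the 8-bit cache brings the total to about 21 GiB instead of about 23.5 GiB, which leaves some headroom on a 24 GB card.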

All comments are greatly appreciated. Download, test, and if you appreciate my work, consider buying me my fuel:

