
Xwin-LM-70B + LimaRP Lora v3

An exllamav2 quantization of TeeZee/Xwin-LM-70B-V0.1_Limarpv3 (2.4 bpw, 6-bit head).

Runs smoothly on a single RTX 3090 in text-generation-webui with the context length set to 4096, the ExLlamav2_HF loader, and cache_8bit=True.
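As a rough sketch, the settings above correspond to launching text-generation-webui with flags along these lines (the model folder name and exact flag spellings are assumptions and may differ between webui versions; check your install's `--help` output):

```shell
# Hypothetical launch command for text-generation-webui;
# flag names and the model directory name may vary by version.
python server.py \
  --model TeeZee_Xwin-LM-70B-V0.1_Limarpv3-bpw2.4-h6 \
  --loader ExLlamav2_HF \
  --max_seq_len 4096 \
  --cache_8bit
```

The same options can also be set from the webui's Model tab instead of the command line.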

All comments are greatly appreciated. Download it, test it, and if you appreciate my work, consider buying me my fuel: Buy Me A Coffee

