ggml-org/gguf-my-repo · Discussions

Add F16 and BF16 quantization

#129 opened 1 day ago by

andito

update readme for card generation

#128 opened 15 days ago by

ariG23498

[Bug] Asymmetric T5 models fail to quantize

#126 opened about 1 month ago by

pszemraj

[Bug] Extra files with related name were uploaded to the resulting repository

#125 opened about 1 month ago by

Felladrin

Issue converting PEFT LoRA fine tuned model to GGUF

2

#124 opened about 1 month ago by

AdnanRiaz107

Issue converting nvidia/NV-Embed-v2 to GGUF

#123 opened about 2 months ago by

redshiva

Issue converting FLUX.1-dev model to GGUF format

2

#122 opened about 2 months ago by

cbrescia

Add Llama 3.1 license

#121 opened about 2 months ago by

jxtngx

Add an option to put all quantization variants in the same repo

#120 opened about 2 months ago by

A2va

Phi-3.5-MoE-instruct

6

#117 opened 2 months ago by

goodasdgood

Fails to quantize T5 (xl and xxl) models

1

#116 opened 2 months ago by

girishponkiya

Arm optimized quants

1

#113 opened 2 months ago by

SaisExperiments

DeepseekForCausalLM is not supported

1

#112 opened 3 months ago by

nanowell

Please update the conversion script: llama.cpp added support for Nemotron and Minitron architectures

3

#111 opened 3 months ago by

NikolayKozloff

Enable the created name repo to be without the quantization type

#110 opened 3 months ago by

A2va

I think I broke the Space quantizing a 4-bit model with Q4L

#106 opened 3 months ago by

hellork

Authorship metadata support added to converter script; you may want to add the ability to set metadata overrides

3

#104 opened 3 months ago by

mofosyne

Please support this method:

7

#96 opened 4 months ago by

ZeroWw

Support Q2 imatrix quants

#95 opened 4 months ago by

Dampfinchen

Maybe impose a max model size?

3

#33 opened 7 months ago by

pcuenq