ISTA-DASLab/Llama-2-7b-AQLM-2Bit-8x8-hf
Text Generation • 2B • Updated • 12
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling