Tim Dolan (macadeliccc)

Fine-tune Phi-3 using a Samantha-themed dataset and the Hugging Face SFT trainer!

In this Colab notebook, we apply a supervised fine-tune to Phi-3 using the ShareGPT conversation format.

# EOS_TOKEN is assumed to be defined earlier, e.g. EOS_TOKEN = tokenizer.eos_token
def formatting_prompts_func(examples):
    """Flatten ShareGPT-style conversations into a single "text" field."""
    convos = examples["conversations"]
    texts = []
    # Map ShareGPT roles to the role headers used in the training text
    mapper = {"system": "system\n", "human": "\nuser\n", "gpt": "\nassistant\n"}
    end_mapper = {"system": "", "human": "", "gpt": ""}
    for convo in convos:
        # Join every turn, prefixing each with its role header
        text = "".join(f"{mapper[(turn := x['from'])]} {x['value']}\n{end_mapper[turn]}" for x in convo)
        texts.append(f"{text}{EOS_TOKEN}")  # terminate each sample with the EOS token
    return {"text": texts}

dataset = dataset.map(formatting_prompts_func, batched=True)
print(dataset['text'][8])
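Once formatted, the dataset can be handed straight to trl's SFTTrainer. A minimal sketch, assuming a Phi-3-mini checkpoint and placeholder hyperparameters (note: newer trl versions move dataset_text_field and max_seq_length into SFTConfig):

from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

model_id = "microsoft/Phi-3-mini-4k-instruct"  # assumed checkpoint
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)
EOS_TOKEN = tokenizer.eos_token

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # the field built by formatting_prompts_func
    max_seq_length=2048,
    args=TrainingArguments(
        output_dir="phi-3-samantha",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-5,
        logging_steps=10,
    ),
)
trainer.train()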

Opus Samantha consists of 1,848 samples with the Samantha personality. The dataset covers a wide variety of topics such as logical reasoning, mathematics, legal questions, and roleplay.
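The dataset can be pulled straight from the Hub; a minimal sketch (the split name is an assumption):

from datasets import load_dataset

# Load the Opus Samantha dataset in ShareGPT format
dataset = load_dataset("macadeliccc/opus_samantha", split="train")
print(dataset[0]["conversations"][:2])  # inspect the first couple of turns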

This notebook serves as a viable option to fine-tune Phi-3 until Unsloth supports Phi-3, which should be very soon. When that happens, check out AutoSloth for SFT, DPO, and Langfuse-format RAG fine-tuning on free-tier Colab hardware.

Resources:
Dataset: macadeliccc/opus_samantha
Colab: https://colab.research.google.com/drive/1e8LILflDQ2Me52hwS7uIfuJ9DxE2oQzM?usp=sharing
AutoSloth: https://colab.research.google.com/drive/1Zo0sVEb2lqdsUm9dy2PTzGySxdF9CNkc#scrollTo=bpimlPXVz-CZ
Quantize 7B parameter models in 60 seconds using Half-Quadratic Quantization (HQQ).

This game-changing technique allows for rapid quantization of models like Llama-2-70B in under 5 minutes, running up to 50x faster than traditional methods and offering high-quality compression without calibration data.

Mobius Labs' innovative approach not only significantly reduces memory requirements but also enables the use of large models on consumer-grade GPUs, paving the way for more accessible and efficient machine learning research.

Mobius Labs' method utilizes a robust optimization formulation to determine the optimal quantization parameters, specifically targeting the minimization of errors between the original and dequantized weights. This involves a loss function that promotes sparsity via a non-convex ℓp norm with p < 1, making the problem challenging yet solvable through a half-quadratic solver.

This solver simplifies the problem by introducing an extra variable and dividing the optimization into manageable sub-problems. Their implementation cleverly fixes the scale parameter to simplify calculations and focuses on optimizing the zero-point, utilizing closed-form solutions for each sub-problem to bypass the need for gradient calculations.
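Schematically, and paraphrasing the blog post (the notation here is my approximation, not a verbatim reproduction), the optimization looks like:

% Minimize dequantization error over the zero-point z (scale s is fixed),
% under a sparsity-promoting l_p loss with p < 1:
\min_{z}\ \phi\!\left( W - Q_z^{-1}\!\left( Q_z(W) \right) \right),
\qquad \phi(x) = \lVert x \rVert_p^p,\quad p < 1

% Half-quadratic splitting introduces an auxiliary variable W_e:
\min_{z,\, W_e}\ \lVert W_e \rVert_p^p
  + \frac{\beta}{2}\, \Bigl\lVert W_e - \bigl( W - Q_z^{-1}(Q_z(W)) \bigr) \Bigr\rVert_2^2

% The solver then alternates between two closed-form sub-problems:
% (1) update W_e via generalized soft-thresholding;
% (2) update the zero-point z in closed form, with no gradients needed.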

Check out the Colab demo, where you can quantize models (text generation and multimodal) for use with the vLLM or timm backends, as well as transformers!
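For reference, quantizing a model with the hqq library looks roughly like this. This is a sketch based on the repo's README around the time of the post; the API has shifted between versions, and the 4-bit settings below are assumptions:

from transformers import AutoTokenizer
from hqq.engine.hf import HQQModelForCausalLM
from hqq.core.quantize import BaseQuantizeConfig

# The base model behind the HQQ example linked below
model_id = "NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO"
model = HQQModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 4-bit weights with a group size of 64 (assumed settings)
quant_config = BaseQuantizeConfig(nbits=4, group_size=64)
model.quantize_model(quant_config=quant_config)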

AutoHQQ: 👉 https://colab.research.google.com/drive/1cG_5R_u9q53Uond7F0JEdliwvoeeaXVN?usp=sharing
Code: https://github.com/mobiusml/hqq
HQQ Blog post: https://mobiusml.github.io/hqq_blog/

Edit: Here is an example of how powerful HQQ can be: macadeliccc/Nous-Hermes-2-Mixtral-8x7B-DPO-HQQ

Citations:

@misc{badri2023hqq,
  title  = {Half-Quadratic Quantization of Large Machine Learning Models},
  url    = {https://mobiusml.github.io/hqq_blog/},
  author = {Hicham Badri and Appu Shaji},
  month  = {November},
  year   = {2023}
}