LoNAS Model Card: lonas-bloomz-7b-math

A super-network fine-tuned from BLOOMZ-7B on a unified set of math reasoning datasets using LoNAS.

Model Details

Information

  • Model name: lonas-bloomz-7b-math
  • Base model: bigscience/bloomz-7b1

Adapter Configuration

  • LoRA rank: 32
  • LoRA alpha: 64
  • LoRA target modules: query_key_value, dense_h_to_4h, dense_4h_to_h
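
For reference, these settings map onto a standard Hugging Face peft LoraConfig roughly as sketched below. This is an illustration only: LoNAS trains elastic low-rank adapters driven by an NNCF configuration (see nncf_config/unified_math/nncf_lonas_bloomz_7b.json in the repository), so the sketch is not the exact training setup.

from peft import LoraConfig

lora_config = LoraConfig(
    r=32,                     # LoRA rank
    lora_alpha=64,            # LoRA alpha
    target_modules=[          # BLOOMZ modules that receive adapters
        "query_key_value",
        "dense_h_to_4h",
        "dense_4h_to_h",
    ],
    task_type="CAUSAL_LM",
)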

Training Hyperparameters

  • Batch size: 16
  • Learning rate: 3e-4
  • Epoch: 8
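
A rough transformers TrainingArguments equivalent of these hyperparameters is sketched below; output_dir is a placeholder, and the actual training entry point in the LoNAS repository may set additional options.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lonas-bloomz-7b-math",  # placeholder output directory
    per_device_train_batch_size=16,     # batch size: 16
    learning_rate=3e-4,                 # learning rate: 3e-4
    num_train_epochs=8,                 # epochs: 8
)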

Training Data

Unified math reasoning dataset: math_10k.json (collected from the training sets of GSM8K, MAWPS, and AQuA).

Evaluation Data

GSM8K, AQuA, MAWPS, and SVAMP.

How to use

Refer to https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_math.py \
    --dataset_path None \
    --model_name_or_path bigscience/bloomz-7b1 \
    --lora \
    --lora_weights lonas-bloomz-7b-math \
    --nncf_config nncf_config/unified_math/nncf_lonas_bloomz_7b.json \
    --do_test \
    --output_dir lonas-bloomz-7b-math/results
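
For a quick qualitative check, the adapter can also be loaded with peft as sketched below. This assumes the released weights are consumable as a standard PEFT LoRA checkpoint and that "lonas-bloomz-7b-math" points to the downloaded adapter directory; the evaluation script above, which applies the NNCF config, remains the supported path.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1", torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, "lonas-bloomz-7b-math")  # local adapter path

prompt = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))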

Evaluation Results

Results of the heuristic sub-network discovered from the super-network:

Method   Total Params.   TFLOPs   GSM8K   AQuA   MAWPS   SVAMP   Average
LoRA     7.1B            1.8      17.4    21.3   70.2    41.0    37.5
LoNAS    6.1B            1.5      18.6    22.0   76.5    31.8    37.2

Model Sources

  • Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

Citation

@article{munoz2024lonas,
  title = {LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models},
  author={J. Pablo Munoz and Jinjie Yuan and Yi Zheng and Nilesh Jain},
  journal={},
  year={2024}
}

License

Apache-2.0
