LoNAS Model Card: lonas-bloomz-7b-math

A super-network obtained by fine-tuning BLOOMZ-7B on unified math reasoning data using LoNAS.

Model Details

Information

  • Model name: lonas-bloomz-7b-math
  • Base model: bigscience/bloomz-7b1

Adapter Configuration

  • LoRA rank: 32
  • LoRA alpha: 64
  • LoRA target modules: query_key_value, dense_h_to_4h, dense_4h_to_h
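
For reference, a minimal sketch of the corresponding PEFT configuration, assuming the adapter follows the standard Hugging Face PEFT format (the actual LoNAS super-network additionally makes these adapters elastic via NNCF, which is not shown here):

from peft import LoraConfig

# Sketch of a LoraConfig matching the values above (standard PEFT assumed).
lora_config = LoraConfig(
    r=32,                    # LoRA rank
    lora_alpha=64,           # LoRA alpha (scaling factor = alpha / r)
    target_modules=[
        "query_key_value",   # BLOOM fused attention projection
        "dense_h_to_4h",     # MLP up-projection
        "dense_4h_to_h",     # MLP down-projection
    ],
    task_type="CAUSAL_LM",
)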

Training Hyperparameters

  • Batch size: 16
  • Learning rate: 3e-4
  • Epochs: 8
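
As an illustration, these hyperparameters map onto transformers TrainingArguments as sketched below; this is a hedged sketch only, since the actual run uses the LoNAS training script from the repository, which also configures the NNCF super-network:

from transformers import TrainingArguments

# Hypothetical mapping of the listed hyperparameters; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="lonas-bloomz-7b-math",
    per_device_train_batch_size=16,
    learning_rate=3e-4,
    num_train_epochs=8,
)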

Training Data

Unified math reasoning dataset: math_10k.json (collected from the training sets of GSM8K, MAWPS, and AQuA).
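
A quick way to inspect the dataset, assuming it is a JSON list of instruction-style examples (the exact schema is an assumption, not confirmed by this card):

import json

# math_10k.json is assumed to be a list of example dicts.
with open("math_10k.json") as f:
    data = json.load(f)

print(len(data))        # number of unified math reasoning examples
print(sorted(data[0]))  # inspect the per-example field names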

Evaluation Data

GSM8K, AQuA, MAWPS, and SVAMP

How to use

Refer to https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS#evaluation:

CUDA_VISIBLE_DEVICES=${DEVICES} python run_math.py \
    --dataset_path None \
    --model_name_or_path bigscience/bloomz-7b1 \
    --lora \
    --lora_weights lonas-bloomz-7b-math \
    --nncf_config nncf_config/unified_math/nncf_lonas_bloomz_7b.json \
    --do_test \
    --output_dir lonas-bloomz-7b-math/results
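
If you only want to run the fine-tuned super-network weights without the NNCF sub-network machinery, a hedged sketch using transformers + peft (assuming the adapter is stored in standard PEFT format; "lonas-bloomz-7b-math" below is the local adapter path used in the command above):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model referenced in the command above.
base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloomz-7b1", torch_dtype=torch.float16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloomz-7b1")

# Attach the LoNAS-trained adapter weights (standard PEFT format assumed).
model = PeftModel.from_pretrained(base, "lonas-bloomz-7b-math")

prompt = "Q: A robe takes 2 bolts of blue fiber and half that much white fiber. How many bolts in total does it take?\nA:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))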

Evaluation Results

Results of the heuristic sub-network discovered from the super-network:

Method   Total Params.   TFLOPs   GSM8K   AQuA   MAWPS   SVAMP   Average
LoRA     7.1B            1.8      17.4    21.3   70.2    41.0    37.5
LoNAS    6.1B            1.5      18.6    22.0   76.5    31.8    37.2
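
The Average column is the unweighted mean of the four task accuracies, which can be checked directly:

# Unweighted mean over GSM8K, AQuA, MAWPS, and SVAMP.
lora = [17.4, 21.3, 70.2, 41.0]
lonas = [18.6, 22.0, 76.5, 31.8]
print(round(sum(lora) / 4, 1))   # 37.5
print(round(sum(lonas) / 4, 1))  # 37.2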

Model Sources

Repository: https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/LoNAS

Paper: LoNAS: Elastic Low-Rank Adapters for Efficient Large Language Models (LREC-COLING 2024), https://aclanthology.org/2024.lrec-main.940

Citation

@inproceedings{munoz-etal-2024-lonas,
    title = "{L}o{NAS}: Elastic Low-Rank Adapters for Efficient Large Language Models",
    author = "Munoz, Juan Pablo  and
      Yuan, Jinjie  and
      Zheng, Yi  and
      Jain, Nilesh",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italia",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.940",
    pages = "10760--10776",
}

License

Apache-2.0
