metadata
license: llama2

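This is an experiment with DARE (Drop and REscale). The underlying observation is that most of the delta parameters (the differences between the fine-tuned and the base weights) can be set directly to zero without affecting the capabilities of SFT LMs, and that larger models can tolerate a higher proportion of discarded parameters.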
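The drop-and-rescale step can be illustrated with a minimal sketch on flat lists of floats. It is an illustration under stated assumptions, not the code used for this experiment: the function name `dare_merge`, the `seed` parameter, and the plain-list weight representation are hypothetical, and the rescaling factor `1 / (1 - drop_rate)` follows the usual description of DARE rather than anything stated in this card.

```python
import random


def dare_merge(base, finetuned, drop_rate, seed=0):
    """Apply drop-and-rescale to the delta between two weight vectors.

    base, finetuned: equal-length lists of floats (hypothetical stand-ins
    for real model tensors).
    drop_rate: probability of zeroing each delta parameter, in [0, 1).
    """
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    if len(base) != len(finetuned):
        raise ValueError("base and finetuned must have the same length")

    rng = random.Random(seed)
    # Surviving deltas are scaled up so the expected delta is unchanged.
    scale = 1.0 / (1.0 - drop_rate)

    merged = []
    for b, f in zip(base, finetuned):
        delta = f - b
        if rng.random() < drop_rate:
            delta = 0.0      # drop: this parameter reverts to the base value
        else:
            delta *= scale   # rescale: compensate for the dropped deltas
        merged.append(b + delta)
    return merged


base_weights = [1.0, 2.0, 3.0, 4.0]
sft_weights = [1.5, 2.5, 2.0, 4.0]
print(dare_merge(base_weights, sft_weights, drop_rate=0.5))
```

With `drop_rate=0.0` the function returns the fine-tuned weights unchanged. As the drop rate rises, more parameters fall back to their base values while the remaining deltas grow proportionally.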

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | DROP |
|---|---|---|---|---|---|---|---|---|
| bhenrym14/mistral-7b-platypus-fp16 | 56.89 | 63.05 | 84.15 | 64.11 | 45.07 | 78.53 | 17.36 | 45.92 |