Catastrophic forgetting test results:

Initial evaluation loss on a 1k subset of the HuggingFaceTB/cosmopedia-100k dataset was 1.102; 100 steps of LISA training reduced this to 1.049.

Comparison to control: cosmo-1b started out at a loss of 1.003 on (a different subset of) the same dataset, increasing to 1.024 at 100 steps.
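
For reference, below is a minimal sketch of how such an evaluation-loss measurement can be run with `transformers` and `datasets`. The exact subset selection, sequence length, and loss aggregation behind the numbers above are not specified, so the choices here (first 1k rows, `max_length=1024`, per-example mean) are assumptions; swap `model_id` between the control and the fine-tuned checkpoint to compare.

```python
# Sketch of the eval-loss measurement; subset/seed and truncation length are guesses.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/cosmo-1b"  # control; use the fine-tuned checkpoint to compare
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

# 1k-example subset of cosmopedia-100k (the actual split/seed used above is unknown).
subset = load_dataset("HuggingFaceTB/cosmopedia-100k", split="train").select(range(1000))

losses = []
with torch.no_grad():
    for row in subset:
        enc = tokenizer(
            row["text"], return_tensors="pt", truncation=True, max_length=1024
        ).to(model.device)
        # For causal LMs, passing labels=input_ids yields the mean
        # next-token cross-entropy over the sequence.
        out = model(**enc, labels=enc["input_ids"])
        losses.append(out.loss.item())

print(f"mean eval loss over {len(losses)} examples: {sum(losses) / len(losses):.3f}")
```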

Axolotl config: same as the QDoRA version, but with DoRA disabled.
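
As a rough illustration of what "with DoRA disabled" means at the config level, here is a hypothetical Axolotl YAML fragment, not the actual config (which is not shown here); the key names follow recent Axolotl conventions and are assumptions.

```yaml
# Hypothetical fragment: only the adapter-related keys that would differ
# from the QDoRA version.
adapter: qlora
load_in_4bit: true
# peft_use_dora: true  # set in the QDoRA config; omitted here
```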
