---
license: apache-2.0
datasets:
- abideen/Cosmopedia-100k-pretrain
language:
- en
tags:
- llama3
- meta
- bitnet
- full bitnet
---

Needs work: it appears that 1-bit models are far less memory-efficient when scaled up.
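For context, BitNet-style models replace full-precision linear weights with ternary values. A minimal NumPy sketch of the absmean weight quantization described in the BitNet b1.58 paper (the function name and epsilon are illustrative, not from this model's code):

```python
import numpy as np

def absmean_quantize(w: np.ndarray, eps: float = 1e-5):
    """BitNet b1.58-style ternary quantization: scale by the mean
    absolute weight, then round and clip to {-1, 0, +1}."""
    scale = np.mean(np.abs(w)) + eps           # per-tensor absmean scale
    w_q = np.clip(np.round(w / scale), -1, 1)  # ternary weights
    return w_q.astype(np.int8), scale          # dequantize as w_q * scale

w = np.array([[0.8, -0.05, -1.2], [0.3, 0.0, 0.9]])
w_q, scale = absmean_quantize(w)
# w_q holds only values in {-1, 0, +1}, so each weight needs ~1.58 bits,
# though the activations and optimizer state still dominate memory at scale.
```

Note that most frameworks still store these ternary weights in int8 containers unless a custom packed kernel is used, which may explain why measured memory savings fall short of the theoretical bit count.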