---
datasets:
- nilq/babylm-10M
language:
- en
---

- GPT-2 model submitted by team CLAUSE Bielefeld to the BabyLM Challenge 2023
- Implements a very naive curriculum learning approach inspired by usage-based linguistics: training examples are ordered according to complexity measures drawn from research on child-directed speech (see the paper for details; a minimal sketch of this ordering appears below the citation)

Citation:

```
@inproceedings{bunzeck-zarriess-2023-gpt,
    title = "{GPT}-wee: How Small Can a Small Language Model Really Get?",
    author = "Bunzeck, Bastian and Zarrie{\ss}, Sina",
    editor = "Warstadt, Alex and Mueller, Aaron and Choshen, Leshem and Wilcox, Ethan and Zhuang, Chengxu and Ciro, Juan and Mosquera, Rafael and Paranjabe, Bhargavi and Williams, Adina and Linzen, Tal and Cotterell, Ryan",
    booktitle = "Proceedings of the BabyLM Challenge at the 27th Conference on Computational Natural Language Learning",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.conll-babylm.2",
    doi = "10.18653/v1/2023.conll-babylm.2",
    pages = "35--46",
}
```
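
The sketch below illustrates the general idea of complexity-ordered ("easy-first") curriculum construction only. The `complexity` and `build_curriculum` functions and the utterance-length proxy are illustrative assumptions, not the measures actually used for this model; consult the paper for the real complexity measures and ordering procedure.

```python
# Minimal sketch of complexity-ordered curriculum construction.
# NOTE: utterance length is only a stand-in complexity proxy for illustration;
# the measures used for GPT-wee are described in the paper.

def complexity(example: str) -> float:
    """Toy proxy: number of whitespace-separated tokens in the utterance."""
    return float(len(example.split()))

def build_curriculum(examples: list[str]) -> list[str]:
    """Sort training examples from simplest to most complex."""
    return sorted(examples, key=complexity)

if __name__ == "__main__":
    utterances = [
        "where did the big red ball go ?",
        "look !",
        "do you want more juice ?",
    ]
    for u in build_curriculum(utterances):
        print(u)
```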