Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research Paper • 2402.00159 • Published Jan 31, 2024 • 61
Catwalk: A Unified Language Model Evaluation Framework for Many Datasets Paper • 2312.10253 • Published Dec 15, 2023 • 7
Paloma: A Benchmark for Evaluating Language Model Fit Paper • 2312.10523 • Published Dec 16, 2023 • 12
Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 Paper • 2311.10702 • Published Nov 17, 2023 • 18
BottleFit: Learning Compressed Representations in Deep Neural Networks for Effective and Efficient Split Computing Paper • 2201.02693 • Published Jan 7, 2022
Split Computing for Complex Object Detectors: Challenges and Preliminary Results Paper • 2007.13312 • Published Jul 27, 2020
Neural Compression and Filtering for Edge-assisted Real-time Object Detection in Challenged Networks Paper • 2007.15818 • Published Jul 31, 2020
torchdistill: A Modular, Configuration-Driven Framework for Knowledge Distillation Paper • 2011.12913 • Published Nov 25, 2020
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges Paper • 2103.04505 • Published Mar 8, 2021
Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems Paper • 2201.05767 • Published Jan 15, 2022
SC2 Benchmark: Supervised Compression for Split Computing Paper • 2203.08875 • Published Mar 16, 2022
torchdistill Meets Hugging Face Libraries for Reproducible, Coding-Free Deep Learning Studies: A Case Study on NLP Paper • 2310.17644 • Published Oct 26, 2023
Cross-Lingual Knowledge Distillation for Answer Sentence Selection in Low-Resource Languages Paper • 2305.16302 • Published May 25, 2023
Rethinking Symbolic Regression Datasets and Benchmarks for Scientific Discovery Paper • 2206.10540 • Published Jun 21, 2022
How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources Paper • 2306.04751 • Published Jun 7, 2023 • 5
What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? Paper • 2204.05832 • Published Apr 12, 2022
What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022 • 2