AISE-TUDelft/Custom-Activations-BERT-Adaptive-GELU
Fill-Mask
Models for the 2024-Q4 BSc. Research Project: "Architectural Decisions for Language Modelling with Small Transformers".