---
license: cc0-1.0
datasets:
- JeanKaddour/minipile
language:
- en
library_name: transformers
---

GPT-NeoX trained on MiniPile, as a baseline to compare my MANN models against. Uses [NeelNanda/gpt-neox-tokenizer-digits](https://huggingface.co/NeelNanda/gpt-neox-tokenizer-digits) for tokenization.

The exact model configuration is as follows:

```
from transformers import AutoTokenizer, GPTNeoXConfig

# Load the tokenizer so the vocab size can be read off directly
tokenizer = AutoTokenizer.from_pretrained("NeelNanda/gpt-neox-tokenizer-digits")

cfg = GPTNeoXConfig(
    vocab_size = len(tokenizer),
    hidden_size = 768,
    intermediate_size = 768 * 4,
    num_hidden_layers = 12,
    num_attention_heads = 12,
    tie_word_embeddings = True,
    hidden_act = "gelu_new",
    tokenizer = "NeelNanda/gpt-neox-tokenizer-digits",
)
```
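For reference, a minimal sketch of how a model can be built from this config and run; the `from_pretrained` repo id is a placeholder assumption (this card does not state the model's path), and the prompt is illustrative only.

```
from transformers import GPTNeoXForCausalLM

# Instantiate a freshly initialized GPT-NeoX model from the config above
# (training on MiniPile is a separate step, not shown here).
model = GPTNeoXForCausalLM(cfg)

# Or, hypothetically, load the trained weights from the Hub
# (replace the repo id with this model's actual path -- an assumption):
# model = GPTNeoXForCausalLM.from_pretrained("<this-repo-id>")

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```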