A 2-layer SoLU model of width 736 (d_model=736), trained on 15B tokens of the Pile. Known bugs: the layernorm just before the unembed is an RMS norm rather than a full layernorm, and the width is not a multiple of 64. As a result d_head=64 and n_heads=11, so n_heads * d_head = 704 != d_model = 736 :(
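The two bugs can be illustrated with a small sketch in plain Python. It is not the model's training code: the floor division used to get `n_heads`, the `eps` value, and the helper names `layer_norm` and `rms_norm` are assumptions made for illustration, and the learned scale and bias are left out.

```python
import math

d_model = 736  # width, from the model card
d_head = 64    # head dimension, from the model card

# Assumption: n_heads was obtained by integer division of the width by d_head.
n_heads = d_model // d_head
attn_width = n_heads * d_head  # total width covered by the attention heads


def layer_norm(x, eps=1e-5):
    """Standard layernorm without learned parameters: center, then scale."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def rms_norm(x, eps=1e-5):
    """RMS norm without learned parameters: scale only, no mean subtraction."""
    mean_sq = sum(v * v for v in x) / len(x)
    return [v / math.sqrt(mean_sq + eps) for v in x]


x = [1.0, 2.0, 3.0, 6.0]  # toy activation vector
ln_out = layer_norm(x)
rms_out = rms_norm(x)

print(n_heads, attn_width, d_model)  # head count, head width total, model width
print(sum(ln_out), sum(rms_out))     # layernorm output is centered, RMS norm output is not
```

The practical consequence is that code assuming `n_heads * d_head == d_model`, or assuming the final norm centers its input, may need adjusting when working with this model.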
