---
license: mit
---

This variant uses a custom `config.head_dim`, as the architecture allows (see the 7b model).
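To illustrate what a custom `head_dim` changes, here is a minimal sketch in plain Python. `ToyConfig`, `resolved_head_dim`, and `q_proj_shape` are hypothetical names made up for this example, and the sizes are arbitrary; none of them are taken from this model or from any library. The sketch assumes the common convention that the head dimension defaults to `hidden_size // num_attention_heads` when it is not set explicitly.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a model config; not a real library class."""
    hidden_size: int
    num_attention_heads: int
    head_dim: Optional[int] = None  # None means "derive it from hidden_size"

    def resolved_head_dim(self) -> int:
        # An explicit head_dim wins; otherwise fall back to the usual ratio.
        if self.head_dim is not None:
            return self.head_dim
        return self.hidden_size // self.num_attention_heads


def q_proj_shape(cfg: ToyConfig) -> Tuple[int, int]:
    """(in_features, out_features) of the query projection under this config."""
    return (cfg.hidden_size, cfg.num_attention_heads * cfg.resolved_head_dim())


# Illustrative sizes only (assumed, not this model's real values).
default_cfg = ToyConfig(hidden_size=64, num_attention_heads=4)
custom_cfg = ToyConfig(hidden_size=64, num_attention_heads=4, head_dim=32)

print(q_proj_shape(default_cfg))
print(q_proj_shape(custom_cfg))
```

With a custom head dimension, the attention projections no longer need to be square: their output width is `num_attention_heads * head_dim`, which can differ from `hidden_size`. Loading code that hard-codes the ratio would build mismatched weight shapes for such a checkpoint.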