MinGPT

Standard GPT-2 architecture with the config below, trained on a subset of OpenWebText.

    n_ctx = 256,
    n_positions = 256,
    n_layer = 6,
    n_embd = 384,
    n_head = 6,
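
As a rough sanity check on what this config implies, the sketch below derives the per-head dimension and an approximate parameter count. It assumes the standard GPT-2 block layout (fused QKV projection, 4x MLP expansion, biases on all linear layers, tied input/output embeddings) and the stock GPT-2 vocabulary size of 50257; neither is stated in this card. `MinGPTConfig` and `count_parameters` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MinGPTConfig:
    # Values taken from the model card.
    n_ctx: int = 256
    n_positions: int = 256
    n_layer: int = 6
    n_embd: int = 384
    n_head: int = 6
    # Assumption: stock GPT-2 BPE vocabulary; the card does not state it.
    vocab_size: int = 50257

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the embedding width.
        if self.n_embd % self.n_head != 0:
            raise ValueError("n_embd must be divisible by n_head")
        return self.n_embd // self.n_head


def count_parameters(cfg: MinGPTConfig) -> int:
    """Estimate trainable parameters, assuming the standard GPT-2 layout."""
    d = cfg.n_embd
    token_emb = cfg.vocab_size * d       # token embeddings (tied with LM head)
    pos_emb = cfg.n_positions * d        # learned position embeddings
    attn = (d * 3 * d + 3 * d) + (d * d + d)          # fused QKV + output proj
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)       # expand 4x, project back
    layer_norms = 2 * (2 * d)            # two LayerNorms per block (weight, bias)
    block = attn + mlp + layer_norms
    final_ln = 2 * d
    return token_emb + pos_emb + cfg.n_layer * block + final_ln


config = MinGPTConfig()
print(config.head_dim)            # 64
print(count_parameters(config))   # 30044544
```

Under these assumptions the model has about 30M parameters, of which roughly 19.3M sit in the token embedding table; a different vocabulary size would change the total accordingly.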

Dataset used to train eswardivi/mingpt-openwebtext