Upload README.md with huggingface_hub
README.md CHANGED
@@ -50,23 +50,6 @@ from dual_attention.hf import DualAttnTransformerLM_HFHub
 DualAttnTransformerLM_HFHub.from_pretrained('awni00/DAT-sa8-ra8-nr32-ns1024-sh8-nkvh4-343M')
 ```
 
-Alternatively, you can download the pytorch checkpoint containing the state dict.
-
-To download the PyTorch checkpoint, run:
-```wget https://huggingface.co/awni00/DAT-sa8-ra8-nr32-ns1024-sh8-nkvh4-343M/resolve/main/pytorch_checkpoint.pt```
-
-Then, you can load model weights via:
-```
-from dual_attention.language_models import DualAttnTransformerLM
-
-ckpt = torch.load(ckpt_path)
-model_config = ckpt['config']
-model_state_dict = ckpt['model']
-
-model = DualAttnTransformerLM(**model_config)
-model.load_state_dict(model_state_dict)
-```
-
 ## Training Details
 
 The model was trained using the following setup:
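The lines removed by this commit described a common PyTorch checkpoint layout: a dict with a `'config'` key (constructor kwargs) and a `'model'` key (the state dict). A minimal self-contained sketch of that pattern, using a hypothetical stand-in module rather than the repo's `DualAttnTransformerLM`:

```python
import torch
import torch.nn as nn

# Stand-in model; the actual repo uses DualAttnTransformerLM from
# dual_attention.language_models, whose constructor takes the saved config.
class TinyLM(nn.Module):
    def __init__(self, vocab_size, d_model):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, x):
        return self.head(self.embed(x))

# Save a checkpoint in the {'config': ..., 'model': ...} layout from the diff.
config = {'vocab_size': 100, 'd_model': 16}
model = TinyLM(**config)
torch.save({'config': config, 'model': model.state_dict()}, 'ckpt.pt')

# Reload: rebuild the model from its saved config, then load the state dict.
ckpt = torch.load('ckpt.pt')
restored = TinyLM(**ckpt['config'])
restored.load_state_dict(ckpt['model'])
```

Storing the config alongside the weights lets a loader reconstruct the architecture without hard-coding hyperparameters, which is why the removed snippet read the config out of the checkpoint before instantiating the model.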