Commit 49e32df
Parent(s): 0ec865c
Update README.md
README.md CHANGED
@@ -19,12 +19,14 @@ We compare the time and space efficiency of this model and some competitors. For
 The experiments are implemented with an NVIDIA A100-SXM4-40GB. Batch size of 1. The figures show the time and memory needed to run one batch. In the training mode, forward pass and backpropagation is included. In the inferring model, only forward pass is included.
 
 ## Training mode
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/kbwNUDuHfsJy6FtfoekXi.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/f-d3hhFAljYMrKkPfn2MJ.png)
 
 # Inferring mode
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/9-PCSONEVOTzZgPuaPSzo.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/q4zLyOQkZ4phmMKPiddSa.png)
 
 # Introduction
 This is the implementation of the BERT model using the LongNet structure (paper: https://arxiv.org/pdf/2307.02486.pdf).
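The README text in this hunk describes the benchmark protocol: one batch at batch size 1, with time and memory recorded, where training mode covers the forward pass plus backpropagation and inferring mode covers the forward pass only. The commit does not include the measurement script, so the following is only a minimal sketch of such a harness. It uses the Python standard library on the CPU, and `measure_one_batch`, `training_step`, and `inference_step` are hypothetical names with toy stand-in workloads, not the repository's code.

```python
import time
import tracemalloc
from typing import Callable, Dict


def measure_one_batch(step: Callable[[], object], warmup: int = 1) -> Dict[str, float]:
    """Measure wall-clock time and peak Python heap usage for one call of `step`."""
    for _ in range(warmup):
        step()  # warm-up calls are excluded from both measurements

    # Time an untraced call so tracemalloc overhead does not inflate the timing.
    start = time.perf_counter()
    step()
    elapsed = time.perf_counter() - start

    # Trace a second call to record the peak memory allocated by Python objects.
    tracemalloc.start()
    step()
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return {"seconds": elapsed, "peak_bytes": float(peak)}


# Hypothetical stand-ins for the two modes; a real run would call the model.
def inference_step(n: int = 10_000) -> float:
    activations = [float(i) * 0.5 for i in range(n)]  # "forward pass" only
    return sum(activations)


def training_step(n: int = 10_000) -> float:
    activations = [float(i) * 0.5 for i in range(n)]  # "forward pass"
    grads = [2.0 * a for a in activations]            # "backward pass" holds extra buffers
    return sum(grads)


if __name__ == "__main__":
    for name, step in [("training", training_step), ("inferring", inference_step)]:
        stats = measure_one_batch(step)
        print(f"{name}: {stats['seconds']:.6f} s, peak {stats['peak_bytes'] / 1024:.1f} KiB")
```

On a GPU such as the A100 named in the README, host-side timers and `tracemalloc` would not capture device work or device memory. If the model runs under PyTorch (an assumption, since the diff does not say), the usual counterparts are calling `torch.cuda.synchronize()` before reading the clock and reading `torch.cuda.max_memory_allocated()` for peak memory.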