minhtriphan committed on
Commit
49e32df
1 Parent(s): 0ec865c

Update README.md

  1. README.md +6 -4
README.md CHANGED
@@ -19,12 +19,14 @@ We compare the time and space efficiency of this model and some competitors. For
19
  The experiments were run on an NVIDIA A100-SXM4-40GB with a batch size of 1. The figures show the time and memory needed to run one batch. In training mode, both the forward pass and backpropagation are included. In inferring mode, only the forward pass is included.
20
 
21
  ## Training mode
22
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/clg3lSItrQuXL5YYh7dmm.png)
23
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/zCwoR6oimLFEO0llErb0g.png)
 
24
 
25
  # Inferring mode
26
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/GKkLON8R1bqa7XRvOoFOp.png)
27
- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/bmEHrGIaAGGwe75Msx3PL.png)
 
28
 
29
  # Introduction
30
  This is the implementation of the BERT model using the LongNet structure (paper: https://arxiv.org/pdf/2307.02486.pdf).
 
19
  The experiments were run on an NVIDIA A100-SXM4-40GB with a batch size of 1. The figures show the time and memory needed to run one batch. In training mode, both the forward pass and backpropagation are included. In inferring mode, only the forward pass is included.
20
 
21
  ## Training mode
22
+
23
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/kbwNUDuHfsJy6FtfoekXi.png)
24
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/f-d3hhFAljYMrKkPfn2MJ.png)
25
 
26
  # Inferring mode
27
+
28
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/9-PCSONEVOTzZgPuaPSzo.png)
29
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/61d2d2993c2083e1c08af221/q4zLyOQkZ4phmMKPiddSa.png)
30
 
31
  # Introduction
32
  This is the implementation of the BERT model using the LongNet structure (paper: https://arxiv.org/pdf/2307.02486.pdf).