asi committed on
Commit d0331cb
1 parent: d2e72e4

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -10,20 +10,20 @@ Implementation of the paper "How Many Layers and Why? An Analysis of the Model D
 
  ## Model architecture
 
- We augment a multi-layer transformer encoder with a halting mechanism, which allows dynamically adjusting the number of layers for each token.
  We directly adapted this mechanism from Graves ([2016](#graves-2016)). At each iteration, we compute a probability for each token to stop updating its state.
 
  ## Model use
 
- The architecture is not yet directly included in the Transformers library. So you shoud install the code implementation first:
 
  ```bash
  pip install git+https://github.com/AntoineSimoulin/adaptive-depth-transformers
  ```
 
- Then You can you se model directly
 
- ```pyhton
  import sys
  sys.path.append('adaptative-depth-transformers')
 
@@ -41,7 +41,6 @@ outputs.updates
  # tensor([[[[15., 9., 10., 7., 3., 8., 5., 7., 12., 10., 6., 8., 8., 9., 5., 8.]]]])
  ```
 
-
  ## Citations
 
  ### BibTeX entry and citation info
 
 
  ## Model architecture
 
+ We augment a multi-layer transformer encoder with a halting mechanism, which dynamically adjusts the number of layers for each token.
  We directly adapted this mechanism from Graves ([2016](#graves-2016)). At each iteration, we compute a probability for each token to stop updating its state.
 
  ## Model use
 
+ The architecture is not yet directly included in the Transformers library. The code used for pre-training is available in the following [github repository](https://github.com/AntoineSimoulin/adaptive-depth-transformers). So you should install the code implementation first:
 
  ```bash
  pip install git+https://github.com/AntoineSimoulin/adaptive-depth-transformers
  ```
 
+ Then you can use the model directly.
 
+ ```python
  import sys
  sys.path.append('adaptative-depth-transformers')
 
 
  # tensor([[[[15., 9., 10., 7., 3., 8., 5., 7., 12., 10., 6., 8., 8., 9., 5., 8.]]]])
  ```
 
  ## Citations
 
  ### BibTeX entry and citation info
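
Note on the "Model architecture" paragraph in the diff above: the halting rule it describes (a per-token probability of stopping at each iteration, adapted from Graves, 2016) can be illustrated with a minimal pure-Python sketch. This is not the repository's code. The function name `halting_schedule`, the threshold parameter `eps`, and the probability values are all assumptions made for illustration; in the real model the probabilities would come from a learned layer.

```python
def halting_schedule(halting_probs, eps=0.01):
    """Illustrative ACT-style halting for a single token (hypothetical helper).

    halting_probs: one halting probability per iteration (layer), each in [0, 1].
    The token stops at the first iteration where the cumulative probability
    reaches 1 - eps, or at the last available iteration.

    Returns (n_updates, weights), where weights[i] is the weight given to the
    state after iteration i + 1. The weights always sum to 1.
    """
    if not halting_probs:
        raise ValueError("need at least one halting probability")

    cumulative = 0.0
    weights = []
    last = len(halting_probs)
    for n, p in enumerate(halting_probs, start=1):
        if cumulative + p >= 1.0 - eps or n == last:
            # Final iteration for this token: it receives the remainder.
            weights.append(1.0 - cumulative)
            return n, weights
        weights.append(p)
        cumulative += p


# Hypothetical probabilities for three tokens over at most four iterations.
tokens = {
    "easy": [0.995, 0.5, 0.5, 0.5],
    "medium": [0.1, 0.2, 0.8, 0.5],
    "hard": [0.1, 0.1, 0.1, 0.1],
}
for name, probs in tokens.items():
    n_updates, weights = halting_schedule(probs)
    print(name, n_updates, weights)
```

Under these assumptions, each token ends up with its own update count, which is presumably the kind of per-token quantity reported in the `outputs.updates` tensor shown in the diff.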