Update README.md
README.md
CHANGED
@@ -22,6 +22,16 @@ this model is only 1B but you can call it somehow an SOTA
 
 this model can also run on 4 GB GPU RAM and know dialogs as well
 
+
+### Train Parameters
+
+- learning-rate : 2e-4
+- sc : cosine lr
+- device : T4 GPU * 4
+- batch-size: AutoFind
+- train time 12 H
+- max sequence length: 1024
+- epochs : 2
 ## Usage Code
 
 ```python