Update README.md
README.md CHANGED
@@ -21,4 +21,4 @@ One possible reason can be that gpt-2 125M is too small a model to summarize and
 <br>
 <br>
 Llama-3B is too big a model to train in a single T100 GPU instance with 15GB RAM. So, I employed qLoRA (quantized low-rank adapters. paper - https://arxiv.org/abs/2305.14314v1) to train it.
-llama-3b.ipynb file has the code for fine tuning of llama-3b. As it can be seen in training details the
+llama-3b.ipynb file has the code for fine tuning of llama-3b. As it can be seen in training details the Llama-3b fine tuning clearly beats the gpt-2 and reasonably so.
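For context, the QLoRA setup referenced in the diff amounts to loading the base model with 4-bit NF4 quantization and attaching small low-rank adapters before training. Below is a minimal sketch using the Hugging Face transformers, peft, and bitsandbytes libraries; the openlm-research/open_llama_3b checkpoint and the LoRA hyperparameters are illustrative assumptions, not taken from llama-3b.ipynb.

```python
# Minimal QLoRA sketch (assumed checkpoint and hyperparameters; see llama-3b.ipynb for the actual setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "openlm-research/open_llama_3b"  # assumed 3B checkpoint

# 4-bit NF4 quantization with double quantization, as described in the QLoRA paper
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

# Freeze the quantized base weights and train only small low-rank adapters
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of parameters are trainable
```

With the adapters attached, only the low-rank matrices are updated during training, which is what should let a 3B-parameter model fit within roughly 15GB of GPU memory.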