myn11 committed on
Commit 4240f2d · 1 Parent(s): 6813754

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -21,4 +21,4 @@ One possible reason can be that gpt-2 125M is too small a model to summarize and
  <br>
  <br>
  Llama-3B is too big a model to train on a single T100 GPU instance with 15GB RAM, so I employed QLoRA (quantized low-rank adapters; paper: https://arxiv.org/abs/2305.14314v1) to train it.
- The llama-3b.ipynb file contains the code for fine-tuning llama-3b. As can be seen in the training details, the llama-3b fine-tuning clearly beats GPT-2, and reasonably so.
+ The llama-3b.ipynb file contains the code for fine-tuning Llama-3B. As can be seen in the training details, the Llama-3B fine-tuning clearly beats GPT-2, and reasonably so.
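
For context, QLoRA makes this feasible on a ~15GB GPU by loading the frozen base model in 4-bit precision and training only small low-rank adapter matrices on top of it. Below is a minimal sketch of that setup using the Hugging Face transformers, peft, and bitsandbytes libraries; the checkpoint name and LoRA hyperparameters are illustrative assumptions, not taken from llama-3b.ipynb.

```python
# Minimal QLoRA setup sketch (assumes transformers, peft, and bitsandbytes are installed;
# the model name and hyperparameters below are illustrative, not from llama-3b.ipynb).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_name = "openlm-research/open_llama_3b"  # assumed 3B base checkpoint

# 4-bit NF4 quantization keeps the frozen base weights small enough for a ~15GB GPU.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# The low-rank adapters are the only trainable parameters; the 4-bit base stays frozen.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of total parameters
```

The resulting model can then be passed to a standard Trainer loop on the summarization data; only the adapter weights are updated and saved.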