sagarshf committed
Commit bb2a171
1 Parent(s): e13de53

Update README.md

Files changed (1)
  1. README.md +0 -21
README.md CHANGED
@@ -30,27 +30,6 @@ During the pretraining phase of our large language model, the model was exposed
 
 Our model was pretrained using a single A100 80GB GPU on the QBlocks platform. We chose bfloat16 as the training precision due to stability issues with float16.
 
- We used parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA) for pretraining, reaching a training loss of approximately 2.8 after almost two days of training.
-
- ```yaml
- # LoRA config
- peft:
-   r: 64
-   lora_alpha: 128
-   target_modules:
-     [
-       "q_proj", "v_proj",
-       "k_proj", "o_proj",
-       "gate_proj", "up_proj",
-       "down_proj",
-     ]
-   lora_dropout: 0.05
-   bias: "none"
-   task_type: "CAUSAL_LM"
-   modules_to_save: ["embed_tokens", "lm_head"]
- ```
-
 ## License
 
  The model inherits the license from meta-llama/Llama-2-7b.
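
For reference, the LoRA setup described in the removed README section maps roughly onto the Hugging Face transformers and peft libraries as sketched below. This is a minimal sketch rather than code from this repository: the checkpoint id and the specific library calls are assumptions, while the hyperparameters mirror the removed YAML block.

```python
# Minimal sketch of the removed LoRA setup, assuming the Hugging Face
# transformers + peft libraries; hyperparameters mirror the removed YAML config.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load the base model in bfloat16 (float16 reportedly caused stability issues).
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed transformers-format checkpoint id
    torch_dtype=torch.bfloat16,
)

# LoRA configuration matching the removed YAML block.
lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    modules_to_save=["embed_tokens", "lm_head"],  # also train embeddings and LM head
)

# Wrap the base model with the LoRA adapters before training.
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Keeping `embed_tokens` and `lm_head` in `modules_to_save` trains the full embedding and output layers alongside the low-rank adapters, a common choice when continuing pretraining on text from a new language or domain.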
 