Update README.md
README.md CHANGED

@@ -30,27 +30,6 @@ During the pretraining phase of our large language model, the model was exposed

Our model was pretrained on a single A100 80GB GPU on the QBlocks platform. We chose bfloat16 as the training precision because of stability issues we encountered with float16.
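
For reference, a minimal sketch of loading the base model in bfloat16, assuming the Hugging Face `transformers` stack and the `meta-llama/Llama-2-7b-hf` checkpoint name (neither is stated explicitly in this section):

```python
# Minimal sketch: load the base model in bfloat16 on a single GPU.
# The library choice and checkpoint name are assumptions, not taken from the README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # assumed Hub identifier for the base model

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # bfloat16 instead of float16, which was unstable
    device_map="auto",           # place the model on the single A100 80GB GPU
)
```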

We used parameter-efficient fine-tuning (PEFT) with Low-Rank Adaptation (LoRA) for this pretraining run, reaching a training loss of approximately 2.8 after almost two days of training. The LoRA configuration is shown below, followed by a sketch of how it maps onto code.

```yaml
# LoRA config
peft:
  r: 64
  lora_alpha: 128
  target_modules: [
    "q_proj", "v_proj",
    "k_proj", "o_proj",
    "gate_proj", "up_proj",
    "down_proj",
  ]
  lora_dropout: 0.05
  bias: "none"
  task_type: "CAUSAL_LM"
  modules_to_save: ["embed_tokens", "lm_head"]
```
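
The YAML above maps onto the Hugging Face `peft` API roughly as follows. This is a sketch under the assumption that `peft` and `transformers` were the underlying libraries, which this section does not state; only the hyperparameter values come from the config block.

```python
# Sketch: applying the LoRA configuration above with the Hugging Face `peft` library.
# Library choice and checkpoint name are assumptions; hyperparameters match the YAML.
import torch
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed checkpoint name for the base model
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    target_modules=[
        "q_proj", "v_proj", "k_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.05,
    bias="none",
    task_type=TaskType.CAUSAL_LM,
    modules_to_save=["embed_tokens", "lm_head"],  # also train embeddings and LM head
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # reports LoRA adapters plus embed_tokens/lm_head as trainable
```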

## License

The model inherits the license from meta-llama/Llama-2-7b.