# Base model: Llama 3.1 8B

## Modifications:
1. Quantization to INT4 for training on a Colab A100 GPU with 40 GB of VRAM.
2. LoRA for parameter-efficient fine-tuning, which allowed attaching an adapter customized for the specific task.
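The two modifications above can be sketched together with the Hugging Face `transformers` + `peft` + `bitsandbytes` stack. This is an illustrative configuration only; the rank, alpha, and target-module values are assumptions, not the ones used in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# INT4 (4-bit) quantization so the 8B model fits on a 40 GB A100
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapter: the base weights stay frozen, only the low-rank
# matrices are trained (values below are placeholders)
lora_config = LoraConfig(
    r=16,                               # assumed rank
    lora_alpha=32,                      # assumed scaling
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

`print_trainable_parameters()` confirms that only a small fraction of the weights are trainable, which is what makes single-GPU fine-tuning feasible here.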

## Observations:
1. The initial model does not have enough predictive power to distinguish the entries passed to it during inference.
2. Adapters do adapt the model to the specific task: this was evident when the model shifted its predictions toward the majority class, instead of predicting at random, during inference.
3. The requirement is straightforward: adapt the model and the data passed to it until it gains some predictive power.
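Observation 2 (collapse toward the majority class) can be checked quantitatively by comparing the prediction distribution to the label distribution. A minimal sketch, with a hypothetical helper name and toy data:

```python
from collections import Counter

def majority_class_rate(predictions):
    """Return (most common class, fraction of predictions on it).

    A rate near 1.0 suggests the adapter collapsed onto the majority
    class rather than learning to separate individual entries.
    """
    counts = Counter(predictions)
    most_common_class, n = counts.most_common(1)[0]
    return most_common_class, n / len(predictions)

# toy predictions for illustration
preds = ["pos", "pos", "pos", "neg", "pos"]
cls, rate = majority_class_rate(preds)
print(cls, rate)  # → pos 0.8
```

Comparing this rate before and after attaching the adapter makes the shift from near-random output to majority-class output visible.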

## Actions:
- Use a 70B model