Update README.md
README.md
CHANGED
@@ -97,19 +97,9 @@ This model is a fine-tuned version of [meta-llama/Meta-Llama-3.1-8B-Instruct](ht
 It achieves the following results on the evaluation set:
 - Loss: 0.4057
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
-## Training and evaluation data
-
-More information needed
-
-## Training procedure
+## Training
+
+It was trained on an **A40** for over an hour with the Axolotl YAML configuration mentioned above.
 
 ### Training hyperparameters
 
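The new "Training" section refers to an Axolotl YAML configuration given earlier in the README, which this diff does not show. Since the "Intended uses & limitations" stub is removed here, a minimal inference sketch may help orient readers: the repo id below is a hypothetical placeholder (the card does not name the final repository), and the rest is the standard `transformers` API.

```python
# Minimal inference sketch. Assumptions: the fine-tune is published as a full
# model and keeps the Llama 3.1 chat template; "user/llama-3.1-8b-sft" is a
# hypothetical repo id, not one named in this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "user/llama-3.1-8b-sft"  # placeholder; substitute the real repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build the prompt with the chat template inherited from the Instruct base model.
messages = [{"role": "user", "content": "Give one sentence about the NVIDIA A40."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```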
@@ -125,94 +115,11 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 100
 - num_epochs: 3
 
-### Training results
-
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 0.3646 | 0.0359 | 3 | 0.4441 |
-| 0.3216 | 0.0719 | 6 | 0.4439 |
-| 0.3628 | 0.1078 | 9 | 0.4435 |
-| 0.2506 | 0.1437 | 12 | 0.4417 |
-| 0.2855 | 0.1796 | 15 | 0.4379 |
-| 0.2472 | 0.2156 | 18 | 0.4310 |
-| 0.3146 | 0.2515 | 21 | 0.4243 |
-| 0.2829 | 0.2874 | 24 | 0.4185 |
-| 0.2926 | 0.3234 | 27 | 0.4139 |
-| 0.3832 | 0.3593 | 30 | 0.4099 |
-| 0.3 | 0.3952 | 33 | 0.4069 |
-| 0.2759 | 0.4311 | 36 | 0.4051 |
-| 0.341 | 0.4671 | 39 | 0.4017 |
-| 0.2268 | 0.5030 | 42 | 0.3989 |
-| 0.3938 | 0.5389 | 45 | 0.3971 |
-| 0.3478 | 0.5749 | 48 | 0.3951 |
-| 0.2745 | 0.6108 | 51 | 0.3935 |
-| 0.2623 | 0.6467 | 54 | 0.3920 |
-| 0.3743 | 0.6826 | 57 | 0.3903 |
-| 0.3205 | 0.7186 | 60 | 0.3898 |
-| 0.332 | 0.7545 | 63 | 0.3897 |
-| 0.268 | 0.7904 | 66 | 0.3876 |
-| 0.2842 | 0.8263 | 69 | 0.3873 |
-| 0.3677 | 0.8623 | 72 | 0.3868 |
-| 0.212 | 0.8982 | 75 | 0.3857 |
-| 0.2656 | 0.9341 | 78 | 0.3854 |
-| 0.2499 | 0.9701 | 81 | 0.3844 |
-| 0.3512 | 1.0060 | 84 | 0.3850 |
-| 0.3069 | 1.0269 | 87 | 0.3848 |
-| 0.3037 | 1.0629 | 90 | 0.3856 |
-| 0.2785 | 1.0988 | 93 | 0.3864 |
-| 0.206 | 1.1347 | 96 | 0.3873 |
-| 0.3354 | 1.1707 | 99 | 0.3912 |
-| 0.3281 | 1.2066 | 102 | 0.3882 |
-| 0.3452 | 1.2425 | 105 | 0.3849 |
-| 0.3153 | 1.2784 | 108 | 0.3851 |
-| 0.3846 | 1.3144 | 111 | 0.3851 |
-| 0.2847 | 1.3503 | 114 | 0.3842 |
-| 0.3128 | 1.3862 | 117 | 0.3842 |
-| 0.282 | 1.4222 | 120 | 0.3866 |
-| 0.2186 | 1.4581 | 123 | 0.3876 |
-| 0.2122 | 1.4940 | 126 | 0.3862 |
-| 0.2877 | 1.5299 | 129 | 0.3837 |
-| 0.2771 | 1.5659 | 132 | 0.3822 |
-| 0.3518 | 1.6018 | 135 | 0.3820 |
-| 0.302 | 1.6377 | 138 | 0.3829 |
-| 0.2653 | 1.6737 | 141 | 0.3833 |
-| 0.3281 | 1.7096 | 144 | 0.3832 |
-| 0.2933 | 1.7455 | 147 | 0.3821 |
-| 0.1959 | 1.7814 | 150 | 0.3824 |
-| 0.2013 | 1.8174 | 153 | 0.3830 |
-| 0.1909 | 1.8533 | 156 | 0.3824 |
-| 0.2321 | 1.8892 | 159 | 0.3812 |
-| 0.2695 | 1.9251 | 162 | 0.3798 |
-| 0.2516 | 1.9611 | 165 | 0.3796 |
-| 0.2148 | 1.9970 | 168 | 0.3796 |
-| 0.2233 | 2.0180 | 171 | 0.3802 |
-| 0.234 | 2.0539 | 174 | 0.3844 |
-| 0.2615 | 2.0898 | 177 | 0.3938 |
-| 0.1582 | 2.1257 | 180 | 0.4031 |
-| 0.218 | 2.1617 | 183 | 0.4071 |
-| 0.2438 | 2.1976 | 186 | 0.4072 |
-| 0.1822 | 2.2335 | 189 | 0.4050 |
-| 0.2163 | 2.2695 | 192 | 0.4028 |
-| 0.1513 | 2.3054 | 195 | 0.4021 |
-| 0.1898 | 2.3413 | 198 | 0.4031 |
-| 0.1857 | 2.3772 | 201 | 0.4059 |
-| 0.1909 | 2.4132 | 204 | 0.4075 |
-| 0.1119 | 2.4491 | 207 | 0.4092 |
-| 0.1794 | 2.4850 | 210 | 0.4091 |
-| 0.1188 | 2.5210 | 213 | 0.4081 |
-| 0.1525 | 2.5569 | 216 | 0.4073 |
-| 0.1897 | 2.5928 | 219 | 0.4069 |
-| 0.1785 | 2.6287 | 222 | 0.4064 |
-| 0.169 | 2.6647 | 225 | 0.4064 |
-| 0.1518 | 2.7006 | 228 | 0.4060 |
-| 0.1896 | 2.7365 | 231 | 0.4052 |
-| 0.1675 | 2.7725 | 234 | 0.4055 |
-| 0.2193 | 2.8084 | 237 | 0.4055 |
-| 0.1887 | 2.8443 | 240 | 0.4057 |
-| 0.1639 | 2.8802 | 243 | 0.4055 |
-| 0.1701 | 2.9162 | 246 | 0.4058 |
-| 0.2019 | 2.9521 | 249 | 0.4057 |
+The loss curves are as follows:
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/66137d95e8d2cda230ddcea6/aUYWcsr8kT3khy6SsrkOd.png)
+
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/66137d95e8d2cda230ddcea6/fHWzXAEEqc-fKAp5Ngpuz.png)
 
 
 ### Framework versions
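The two image links added above replace the long numeric results table this commit deletes. The same evaluation trajectory is recoverable from the removed rows themselves; the sketch below re-plots a representative subset of the (step, validation loss) pairs from that table, purely to show how such a curve can be reproduced.

```python
# Re-plot the evaluation-loss curve from (step, validation_loss) pairs taken
# from the table removed in this diff; a representative subset is used here.
import matplotlib.pyplot as plt

steps = [3, 24, 48, 84, 126, 168, 210, 249]
val_loss = [0.4441, 0.4185, 0.3951, 0.3850, 0.3862, 0.3796, 0.4091, 0.4057]

plt.plot(steps, val_loss, marker="o")
plt.xlabel("Step")
plt.ylabel("Validation loss")
plt.title("Evaluation loss during fine-tuning")
plt.tight_layout()
plt.show()
```

Consistent with the full table, the curve bottoms out near step 165 (about 0.3796) and drifts upward during the third epoch toward the final reported loss of 0.4057.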