PiGrieco committed
Commit 387a0ff
1 Parent(s): 8a585f9

PiGrieco/Llama3-q4_k_m
README.md CHANGED
@@ -1,12 +1,11 @@
  ---
- license: llama2
- library_name: peft
+ license: mit
+ base_model: roberta-base
  tags:
- - trl
- - sft
- - unsloth
  - generated_from_trainer
- base_model: unsloth/llama-3-8b-bnb-4bit
+ metrics:
+ - accuracy
+ - f1
  model-index:
  - name: Llama3-q4_k_m
    results: []
@@ -17,7 +16,11 @@ should probably proofread and complete it, then remove this comment. -->

  # Llama3-q4_k_m

- This model is a fine-tuned version of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit) on an unknown dataset.
+ This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.0938
+ - Accuracy: 0.9825
+ - F1: 0.9827

  ## Model description

@@ -36,21 +39,31 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 2
+ - learning_rate: 5e-05
+ - train_batch_size: 8
  - eval_batch_size: 8
- - seed: 3407
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 8
+ - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 5
- - training_steps: 60
+ - num_epochs: 8
+
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
+ | 0.3823        | 1.0   | 129  | 0.1932          | 0.9532   | 0.9535 |
+ | 0.1585        | 2.0   | 258  | 0.3872          | 0.8977   | 0.9057 |
+ | 0.3048        | 3.0   | 387  | 0.1816          | 0.9474   | 0.9477 |
+ | 0.2353        | 4.0   | 516  | 0.1817          | 0.9591   | 0.9605 |
+ | 0.2928        | 5.0   | 645  | 0.2058          | 0.9503   | 0.9524 |
+ | 0.2452        | 6.0   | 774  | 0.1246          | 0.9737   | 0.9742 |
+ | 0.348         | 7.0   | 903  | 0.0932          | 0.9825   | 0.9827 |
+ | 0.1316        | 8.0   | 1032 | 0.0938          | 0.9825   | 0.9827 |
+

  ### Framework versions

- - PEFT 0.11.1
- - Transformers 4.41.0
+ - Transformers 4.41.2
  - Pytorch 2.3.0+cu121
- - Datasets 2.19.1
- - Tokenizers 0.19.1
+ - Datasets 2.19.2
+ - Tokenizers 0.19.1
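For context only: the commit itself contains no training script, but the hyperparameters in the updated card map directly onto a standard `transformers` Trainer setup. The sketch below is a hypothetical reconstruction under that assumption; the dataset (listed as "None" in the card), the number of labels, and the F1 averaging mode are all assumptions, not facts from this repository.

```python
import numpy as np
import evaluate
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

accuracy = evaluate.load("accuracy")
f1 = evaluate.load("f1")

def compute_metrics(eval_pred):
    # The card reports accuracy and F1 on the evaluation set.
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy.compute(predictions=preds, references=labels)["accuracy"],
        # "weighted" averaging is an assumption; the card does not specify.
        "f1": f1.compute(predictions=preds, references=labels, average="weighted")["f1"],
    }

def build_trainer(train_dataset, eval_dataset):
    # num_labels=2 is an assumption; the card does not describe the task.
    tokenizer = AutoTokenizer.from_pretrained("roberta-base")
    model = AutoModelForSequenceClassification.from_pretrained(
        "roberta-base", num_labels=2
    )
    args = TrainingArguments(
        output_dir="Llama3-q4_k_m",
        learning_rate=5e-5,             # learning_rate: 5e-05
        per_device_train_batch_size=8,  # train_batch_size: 8
        per_device_eval_batch_size=8,   # eval_batch_size: 8
        num_train_epochs=8,             # num_epochs: 8
        seed=42,                        # seed: 42
        lr_scheduler_type="linear",     # lr_scheduler_type: linear
        # Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the default
        # AdamW configuration, so no explicit optimizer arguments are needed.
        eval_strategy="epoch",          # matches the per-epoch results table
        logging_dir="logs",
    )
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,    # caller must supply; dataset unknown
        eval_dataset=eval_dataset,      # caller must supply; dataset unknown
        tokenizer=tokenizer,
        compute_metrics=compute_metrics,
    )

# Usage: trainer = build_trainer(train_ds, eval_ds); trainer.train()
```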
logs/events.out.tfevents.1717779800.e3a39cc1c013.424.3 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4c1f4034fe09049a8cfe01ce0367531f93eda09b20ff8b3d4cbc285cbd6d6ef4
+ size 29803
logs/events.out.tfevents.1717780430.e3a39cc1c013.424.4 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:243abaf63319bb4e0f5ac52853e2d782112c7de8a7b5587f0ed7ee8464e3cc35
+ size 29894
logs/events.out.tfevents.1717781196.e3a39cc1c013.424.5 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:777a1308f8a540ec39d89fec884ea0f50c6ebf36ab62959bbd912be8f70a67d6
+ size 29894
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a3ed68c11cc3249509e926842da977d0250b14d6c170a2fce9ddb41d24a6512a
+ oid sha256:6f9595da539b149233c9546da281b84469159c4fbba67540a934f193344c394c
  size 498612824
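The log and weight files above are stored as Git LFS pointers: the repository tracks only the spec version, the sha256 of the object, and its byte size, while the blob itself lives in LFS storage. A minimal sketch (not part of this commit) of verifying a fetched file against the digest in the new pointer:

```python
# Hypothetical helper, not from this repository: checks that a downloaded
# artifact matches the sha256 recorded in its Git LFS pointer file.
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file so large weights (e.g. the ~498 MB safetensors) fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Digest recorded for model.safetensors by the new pointer in this commit.
expected = "6f9595da539b149233c9546da281b84469159c4fbba67540a934f193344c394c"
assert sha256_of("model.safetensors") == expected, "LFS object does not match pointer"
```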