frankmorales2020 committed on
Commit
a9e4095
1 Parent(s): abe7064

Model save

Files changed (1)
  1. README.md +7 -24
README.md CHANGED
@@ -22,8 +22,7 @@ This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.1](https

  ## Model description

- https://ai.plainenglish.io/fine-tuning-the-mistral-7b-instruct-v0-1-model-with-aviationqa-dataset-06773d447c03
-
+ More information needed

  ## Intended uses & limitations

@@ -31,23 +30,7 @@ More information needed

  ## Training and evaluation data

-
- The following hyperparameters were used during training:
- - learning_rate: 0.0002
- - train_batch_size: 3
- - eval_batch_size: 8
- - seed: 42
- - gradient_accumulation_steps: 2
- - total_train_batch_size: 6
- - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: constant
- - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1
-
- The accuracy of the match Accuracy (Eval dataset and predict) for a sample of 10,000 questions is 23.21%; the AviationQA dataset has 1 Million questions related to the aviation domain.
-
- The source code for the evaluation is here https://github.com/frank-morales2020/MLxDL/blob/main/FineTunning_Testing_For_AviationQADataset_.ipynb
-
+ More information needed

  ## Training procedure

@@ -55,15 +38,15 @@ The source code for the evaluation is here https://github.com/frank-morales2020/

  The following hyperparameters were used during training:
  - learning_rate: 0.0002
- - train_batch_size: 3
- - eval_batch_size: 8
+ - train_batch_size: 2
+ - eval_batch_size: 4
  - seed: 42
  - gradient_accumulation_steps: 2
- - total_train_batch_size: 6
+ - total_train_batch_size: 4
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: constant
  - lr_scheduler_warmup_ratio: 0.03
- - num_epochs: 1
+ - num_epochs: 40

  ### Training results

@@ -74,5 +57,5 @@ The following hyperparameters were used during training:
  - PEFT 0.11.1
  - Transformers 4.41.2
  - Pytorch 2.3.0+cu121
- - Datasets 2.19.2
+ - Datasets 2.20.0
  - Tokenizers 0.19.1
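
For reference, a minimal sketch of how the updated hyperparameters above could be expressed with the `transformers` `TrainingArguments` API. This is not part of the commit: `output_dir` and anything not listed in the diff are placeholder assumptions, and the actual training script may differ.

```python
# Sketch only: maps the hyperparameters listed in the new README to
# TrainingArguments fields. Values not shown in the diff are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-instruct-aviationqa",  # placeholder path
    learning_rate=2e-4,                 # learning_rate: 0.0002
    per_device_train_batch_size=2,      # train_batch_size: 2
    per_device_eval_batch_size=4,       # eval_batch_size: 4
    gradient_accumulation_steps=2,      # total_train_batch_size: 2 * 2 = 4
    num_train_epochs=40,                # num_epochs: 40
    lr_scheduler_type="constant",
    warmup_ratio=0.03,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```

Note that the reported total_train_batch_size of 4 is simply the per-device batch size (2) multiplied by the gradient accumulation steps (2).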