robinsmits committed
Commit
5eb7542
1 Parent(s): fc52456

update model card README.md

Files changed (1)
  1. README.md +15 -39
README.md CHANGED
@@ -1,22 +1,11 @@
  ---
- license: cc-by-nc-4.0
- inference: false
- datasets:
- - BramVanroy/alpaca-cleaned-dutch
+ license: apache-2.0
  base_model: DAMO-NLP-MT/polylm-1.7b
  tags:
  - generated_from_trainer
- - alpaca
- - Transformers
- - PolyLM
- - text-generation-inference
  model-index:
  - name: polylm_1.7b_ft_alpaca_clean_dutch
    results: []
- language:
- - nl
- library_name: peft
- pipeline_tag: text-generation
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
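Note: after this change the card metadata keeps only the license, base model, trainer tag and model-index entries. A minimal sketch of reading that metadata with `huggingface_hub` (the repo id below is an assumption inferred from the model name and committer, not stated in this diff):

```python
# Minimal sketch: read the updated card metadata from the Hub.
# Assumption: repo id inferred from the model name and committer.
from huggingface_hub import ModelCard

card = ModelCard.load("robinsmits/polylm_1.7b_ft_alpaca_clean_dutch")
print(card.data.license)     # expected "apache-2.0" after this commit
print(card.data.base_model)  # "DAMO-NLP-MT/polylm-1.7b"
print(card.data.tags)        # ["generated_from_trainer"]
```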
@@ -26,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [DAMO-NLP-MT/polylm-1.7b](https://huggingface.co/DAMO-NLP-MT/polylm-1.7b) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.8174
+ - Loss: 1.8483

  ## Model description

@@ -42,21 +31,10 @@ More information needed

  ## Training procedure

-
- The following `bitsandbytes` quantization config was used during training:
- - load_in_8bit: False
- - load_in_4bit: True
- - llm_int8_threshold: 6.0
- - llm_int8_skip_modules: None
- - llm_int8_enable_fp32_cpu_offload: False
- - llm_int8_has_fp16_weight: False
- - bnb_4bit_quant_type: nf4
- - bnb_4bit_use_double_quant: True
- - bnb_4bit_compute_dtype: bfloat16
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0002
+ - learning_rate: 0.0001
  - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
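Note: the `bitsandbytes` settings removed above describe how the base model was quantized during fine-tuning. As a rough, non-authoritative sketch (the training script itself is not part of this commit), they correspond to a `transformers.BitsAndBytesConfig` like this:

```python
# Sketch only: the removed quantization settings expressed as a BitsAndBytesConfig.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # load_in_4bit: True (load_in_8bit: False)
    bnb_4bit_quant_type="nf4",              # bnb_4bit_quant_type: nf4
    bnb_4bit_use_double_quant=True,         # bnb_4bit_use_double_quant: True
    bnb_4bit_compute_dtype=torch.bfloat16,  # bnb_4bit_compute_dtype: bfloat16
)

# Illustrative load of the base model with that config; whether PolyLM needs
# trust_remote_code is not stated in this commit.
model = AutoModelForCausalLM.from_pretrained(
    "DAMO-NLP-MT/polylm-1.7b",
    quantization_config=bnb_config,
    device_map="auto",
)
```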
@@ -71,25 +49,23 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 2.0693 | 0.16 | 128 | 2.0915 |
- | 2.0029 | 0.33 | 256 | 2.0195 |
- | 2.0006 | 0.49 | 384 | 1.9779 |
- | 1.933 | 0.66 | 512 | 1.9409 |
- | 1.9532 | 0.82 | 640 | 1.9217 |
- | 1.8959 | 0.99 | 768 | 1.8978 |
- | 1.8237 | 1.15 | 896 | 1.8838 |
- | 1.8218 | 1.32 | 1024 | 1.8693 |
- | 1.8072 | 1.48 | 1152 | 1.8521 |
- | 1.8103 | 1.65 | 1280 | 1.8395 |
- | 1.8275 | 1.81 | 1408 | 1.8266 |
- | 1.7902 | 1.98 | 1536 | 1.8174 |
+ | 2.1248 | 0.16 | 128 | 2.1129 |
+ | 2.0512 | 0.33 | 256 | 2.0347 |
+ | 1.9983 | 0.49 | 384 | 1.9948 |
+ | 1.9557 | 0.66 | 512 | 1.9655 |
+ | 1.9583 | 0.82 | 640 | 1.9386 |
+ | 1.916 | 0.99 | 768 | 1.9177 |
+ | 1.8671 | 1.15 | 896 | 1.9019 |
+ | 1.8626 | 1.32 | 1024 | 1.8885 |
+ | 1.8321 | 1.48 | 1152 | 1.8762 |
+ | 1.8596 | 1.65 | 1280 | 1.8631 |
+ | 1.843 | 1.81 | 1408 | 1.8539 |
+ | 1.8333 | 1.98 | 1536 | 1.8483 |


  ### Framework versions

- - PEFT 0.4.0
  - Transformers 4.31.0
  - Pytorch 2.0.1+cu118
  - Datasets 2.13.1
  - Tokenizers 0.13.3
- - PEFT 0.4.0
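Note: the old metadata (`library_name: peft`, `PEFT 0.4.0`) indicates the repository contains a PEFT adapter rather than full model weights. A minimal usage sketch, assuming the adapter repo id from the model name and committer (the prompt template is not specified in this diff):

```python
# Minimal sketch: attach the PEFT adapter to the base model and generate.
# Assumptions: adapter repo id inferred from the model name/committer;
# use_fast=False for the PolyLM tokenizer; no specific prompt template.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "DAMO-NLP-MT/polylm-1.7b"
adapter_id = "robinsmits/polylm_1.7b_ft_alpaca_clean_dutch"

tokenizer = AutoTokenizer.from_pretrained(base_id, use_fast=False)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)

inputs = tokenizer("Schrijf een kort gedicht over de zee.", return_tensors="pt").to(base.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```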
 