---
license: mit
base_model: microsoft/kosmos-2-patch14-224
tags:
- generated_from_trainer
model-index:
- name: kosm-checkpoint
  results: []
---

# kosm-checkpoint

This model is a fine-tuned version of [microsoft/kosmos-2-patch14-224](https://huggingface.co/microsoft/kosmos-2-patch14-224) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0340

## Model description

The base model, Kosmos-2, is a grounded multimodal large language model: it can link spans of generated text to image regions through special bounding-box location tokens, supporting tasks such as phrase grounding, referring expression comprehension, and grounded image captioning. What this particular fine-tune targets is not documented; more information is needed.
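
Assuming this checkpoint keeps the base model's architecture and processor, a minimal inference sketch with the Transformers API might look like the following. The repo id `kosm-checkpoint` is a placeholder for the actual path of this checkpoint, and the sample image URL comes from the base model's card, not from this fine-tune:

```python
import requests
from PIL import Image
from transformers import AutoProcessor, Kosmos2ForConditionalGeneration

# Placeholder repo id; substitute the actual path of this checkpoint.
model = Kosmos2ForConditionalGeneration.from_pretrained("kosm-checkpoint")
processor = AutoProcessor.from_pretrained("kosm-checkpoint")

# Kosmos-2 pairs an image with a text prompt; the <grounding> token asks the
# model to emit bounding-box location tokens alongside the generated text.
prompt = "<grounding>An image of"
url = "https://huggingface.co/microsoft/kosmos-2-patch14-224/resolve/main/snowman.png"
image = Image.open(requests.get(url, stream=True).raw)

inputs = processor(text=prompt, images=image, return_tensors="pt")
generated_ids = model.generate(
    pixel_values=inputs["pixel_values"],
    input_ids=inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    image_embeds_position_mask=inputs["image_embeds_position_mask"],
    max_new_tokens=64,
)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

# Split the raw output into a clean caption plus the grounded entities
# (each entity carries its text span and normalized bounding boxes).
caption, entities = processor.post_process_generation(generated_text)
print(caption)
print(entities)
```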

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3.0
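
The total train batch size above is derived rather than set directly: per-device batch size × gradient accumulation steps = 2 × 4 = 8. As a hedged sketch, the listed values map onto `TrainingArguments` roughly as follows; the output directory and the evaluation/logging cadence are assumptions inferred from the results table, and the Adam betas/epsilon match the library defaults, so they need no explicit arguments:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kosm-checkpoint",    # assumed name
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 2 * 4 = 8
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3.0,
    eval_strategy="steps",           # the table evaluates every 200 steps
    eval_steps=200,
    logging_steps=400,               # training loss is reported every 400 steps
)
```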

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| No log        | 0.0497 | 200   | 0.0700          |
| 0.0802        | 0.0993 | 400   | 0.0581          |
| 0.0676        | 0.1490 | 600   | 0.0496          |
| 0.0584        | 0.1986 | 800   | 0.0450          |
| 0.0582        | 0.2483 | 1000  | 0.0481          |
| 0.0582        | 0.2979 | 1200  | 0.0486          |
| 0.0572        | 0.3476 | 1400  | 0.0445          |
| 0.0537        | 0.3972 | 1600  | 0.0463          |
| 0.0504        | 0.4469 | 1800  | 0.0421          |
| 0.0473        | 0.4965 | 2000  | 0.0402          |
| 0.0473        | 0.5462 | 2200  | 0.0423          |
| 0.0460        | 0.5958 | 2400  | 0.0394          |
| 0.0448        | 0.6455 | 2600  | 0.0369          |
| 0.0423        | 0.6951 | 2800  | 0.0378          |
| 0.0403        | 0.7448 | 3000  | 0.0360          |
| 0.0403        | 0.7944 | 3200  | 0.0364          |
| 0.0392        | 0.8441 | 3400  | 0.0352          |
| 0.0388        | 0.8937 | 3600  | 0.0347          |
| 0.0375        | 0.9434 | 3800  | 0.0343          |
| 0.0370        | 0.9930 | 4000  | 0.0345          |
| 0.0370        | 1.0427 | 4200  | 0.0355          |
| 0.0300        | 1.0924 | 4400  | 0.0338          |
| 0.0283        | 1.1420 | 4600  | 0.0349          |
| 0.0281        | 1.1917 | 4800  | 0.0347          |
| 0.0288        | 1.2413 | 5000  | 0.0322          |
| 0.0288        | 1.2910 | 5200  | 0.0331          |
| 0.0279        | 1.3406 | 5400  | 0.0335          |
| 0.0272        | 1.3903 | 5600  | 0.0322          |
| 0.0275        | 1.4399 | 5800  | 0.0338          |
| 0.0271        | 1.4896 | 6000  | 0.0324          |
| 0.0271        | 1.5392 | 6200  | 0.0324          |
| 0.0263        | 1.5889 | 6400  | 0.0320          |
| 0.0262        | 1.6385 | 6600  | 0.0319          |
| 0.0264        | 1.6882 | 6800  | 0.0317          |
| 0.0256        | 1.7378 | 7000  | 0.0322          |
| 0.0256        | 1.7875 | 7200  | 0.0320          |
| 0.0255        | 1.8371 | 7400  | 0.0316          |
| 0.0242        | 1.8868 | 7600  | 0.0327          |
| 0.0262        | 1.9364 | 7800  | 0.0307          |
| 0.0252        | 1.9861 | 8000  | 0.0304          |
| 0.0252        | 2.0357 | 8200  | 0.0343          |
| 0.0173        | 2.0854 | 8400  | 0.0373          |
| 0.0148        | 2.1351 | 8600  | 0.0345          |
| 0.0150        | 2.1847 | 8800  | 0.0347          |
| 0.0148        | 2.2344 | 9000  | 0.0347          |
| 0.0148        | 2.2840 | 9200  | 0.0354          |
| 0.0132        | 2.3337 | 9400  | 0.0351          |
| 0.0136        | 2.3833 | 9600  | 0.0362          |
| 0.0132        | 2.4330 | 9800  | 0.0360          |
| 0.0138        | 2.4826 | 10000 | 0.0352          |
| 0.0138        | 2.5323 | 10200 | 0.0359          |
| 0.0138        | 2.5819 | 10400 | 0.0348          |
| 0.0132        | 2.6316 | 10600 | 0.0348          |
| 0.0129        | 2.6812 | 10800 | 0.0337          |
| 0.0134        | 2.7309 | 11000 | 0.0354          |
| 0.0134        | 2.7805 | 11200 | 0.0350          |
| 0.0132        | 2.8302 | 11400 | 0.0351          |
| 0.0128        | 2.8798 | 11600 | 0.0350          |
| 0.0130        | 2.9295 | 11800 | 0.0339          |
| 0.0120        | 2.9791 | 12000 | 0.0340          |


### Framework versions

- Transformers 4.42.4
- Pytorch 2.1.2+cu121
- Datasets 2.15.0
- Tokenizers 0.19.1
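
To approximate this environment, a minimal pin might look like the sketch below; the `+cu121` suffix identifies the CUDA 12.1 build of PyTorch and is omitted here, since pip resolves the matching wheel:

```
transformers==4.42.4
torch==2.1.2
datasets==2.15.0
tokenizers==0.19.1
```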