kaixkhazaki committed
Commit c41bb4e · verified · 1 Parent(s): f2d860a

Pushing of the new best model checkpoint

README.md CHANGED
@@ -4,16 +4,14 @@ license: mit
 base_model: deepset/gbert-large
 tags:
 - generated_from_trainer
 model-index:
 - name: german-zeroshot
   results: []
- datasets:
- - facebook/xnli
- language:
- - de
- metrics:
- - accuracy
- pipeline_tag: zero-shot-classification
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -21,60 +19,13 @@ should probably proofread and complete it, then remove this comment. -->

 # german-zeroshot

- This model is a fine-tuned version of [deepset/gbert-large](https://huggingface.co/deepset/gbert-large) on the facebook/xnli German (de) dataset.
 It achieves the following results on the evaluation set:
- - eval_loss: 0.5051
- - eval_accuracy: 0.8096
- - eval_f1: 0.8102
- - eval_precision: 0.8131
- - eval_recall: 0.8096
- - eval_runtime: 5.9824
- - eval_samples_per_second: 416.224
- - eval_steps_per_second: 13.038
- - epoch: 0.4889
- - step: 3000
-
- ```python
- # Use a pipeline as a high-level helper
- import torch
- from transformers import pipeline
-
- pipe = pipeline(
-     "zero-shot-classification",
-     model="kaixkhazaki/german-zeroshot",
-     tokenizer="kaixkhazaki/german-zeroshot",
-     device=0 if torch.cuda.is_available() else -1  # use GPU if available
- )
-
- # Enter your text and the candidate classification labels
- sequence = "Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?"
- candidate_labels = [
-     "Technische Dokumentation",
-     "IT-Support",
-     "Netzwerkadministration",
-     "Linux-Konfiguration",
-     "VPN-Setup"
- ]
- pipe(sequence, candidate_labels)
- >>
- {'sequence': 'Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?',
-  'labels': ['VPN-Setup', 'Linux-Konfiguration', 'Netzwerkadministration', 'IT-Support', 'Technische Dokumentation'],
-  'scores': [0.3245040476322174, 0.32373329997062683, 0.16423103213310242, 0.09850211441516876, 0.08902951329946518]}
-
- # Example 2
- sequence = "Wie lautet die Garantiezeit für dieses Produkt?"
- candidate_labels = [
-     "Garantiebedingungen",
-     "Produktdetails",
-     "Reklamation",
-     "Kundendienst",
-     "Kaufberatung"
- ]
- pipe(sequence, candidate_labels)
- >>
- {'sequence': 'Wie lautet die Garantiezeit für dieses Produkt?',
-  'labels': ['Garantiebedingungen', 'Produktdetails', 'Reklamation', 'Kundendienst', 'Kaufberatung'],
-  'scores': [0.4313304126262665, 0.2905466556549072, 0.10058070719242096, 0.09384352713823318, 0.08369863778352737]}
- ```
 
 ## Model description

@@ -102,9 +53,33 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 500
 - num_epochs: 3
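The `lr_scheduler_warmup_steps: 500` entry above implies a warmup phase before decay. A minimal sketch of that schedule, assuming the Trainer's default `linear` scheduler; the base learning rate and total step count here are illustrative assumptions, not values stated in the card:

```python
def linear_schedule_lr(step, base_lr=2e-5, warmup_steps=500, total_steps=18414):
    """Linear warmup to base_lr over warmup_steps, then linear decay to 0.

    base_lr and total_steps are assumed for illustration; only
    warmup_steps=500 and num_epochs=3 appear in the card.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```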
 ### Framework versions

 - Transformers 4.48.0.dev0
 - Pytorch 2.4.1+cu121
 - Datasets 3.1.0
- - Tokenizers 0.21.0
 
 base_model: deepset/gbert-large
 tags:
 - generated_from_trainer
+ metrics:
+ - accuracy
+ - f1
+ - precision
+ - recall
 model-index:
 - name: german-zeroshot
   results: []
 ---

 <!-- This model card has been generated automatically according to the information the Trainer had access to. You

 # german-zeroshot

+ This model is a fine-tuned version of [deepset/gbert-large](https://huggingface.co/deepset/gbert-large) on the facebook/xnli German (de) dataset.
 It achieves the following results on the evaluation set:
+ - Loss: 0.4592
+ - Accuracy: 0.8486
+ - F1: 0.8487
+ - Precision: 0.8505
+ - Recall: 0.8486

 ## Model description

 - lr_scheduler_warmup_steps: 500
 - num_epochs: 3
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
+ |:-------------:|:------:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
+ | 0.6429 | 0.1630 | 1000 | 0.5203 | 0.8004 | 0.8006 | 0.8009 | 0.8004 |
+ | 0.5715 | 0.3259 | 2000 | 0.5209 | 0.7964 | 0.7968 | 0.8005 | 0.7964 |
+ | 0.5897 | 0.4889 | 3000 | 0.5435 | 0.7924 | 0.7940 | 0.8039 | 0.7924 |
+ | 0.5701 | 0.6519 | 4000 | 0.5242 | 0.7880 | 0.7884 | 0.8078 | 0.7880 |
+ | 0.5238 | 0.8149 | 5000 | 0.4816 | 0.8233 | 0.8226 | 0.8263 | 0.8233 |
+ | 0.5285 | 0.9778 | 6000 | 0.4483 | 0.8265 | 0.8273 | 0.8303 | 0.8265 |
+ | 0.4302 | 1.1408 | 7000 | 0.4751 | 0.8209 | 0.8214 | 0.8277 | 0.8209 |
+ | 0.4163 | 1.3038 | 8000 | 0.4560 | 0.8285 | 0.8289 | 0.8344 | 0.8285 |
+ | 0.3942 | 1.4668 | 9000 | 0.4330 | 0.8414 | 0.8422 | 0.8454 | 0.8414 |
+ | 0.3875 | 1.6297 | 10000 | 0.4171 | 0.8430 | 0.8432 | 0.8455 | 0.8430 |
+ | 0.3639 | 1.7927 | 11000 | 0.4194 | 0.8442 | 0.8447 | 0.8487 | 0.8442 |
+ | 0.3768 | 1.9557 | 12000 | 0.4215 | 0.8474 | 0.8477 | 0.8492 | 0.8474 |
+ | 0.2443 | 2.1186 | 13000 | 0.4750 | 0.8390 | 0.8398 | 0.8452 | 0.8390 |
+ | 0.2404 | 2.2816 | 14000 | 0.4592 | 0.8486 | 0.8487 | 0.8505 | 0.8486 |
+ | 0.2154 | 2.4446 | 15000 | 0.4914 | 0.8418 | 0.8424 | 0.8466 | 0.8418 |
+ | 0.2157 | 2.6076 | 16000 | 0.4804 | 0.8454 | 0.8458 | 0.8488 | 0.8454 |
+ | 0.2249 | 2.7705 | 17000 | 0.4809 | 0.8466 | 0.8471 | 0.8507 | 0.8466 |
+ | 0.2204 | 2.9335 | 18000 | 0.4777 | 0.8466 | 0.8470 | 0.8502 | 0.8466 |
+
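The headline metrics in this commit (Loss 0.4592, Accuracy 0.8486) match the step-14000 row of the table above, which suggests the "best model checkpoint" from the commit message was selected by evaluation accuracy rather than loss. A quick sketch of that selection over the transcribed rows:

```python
# (step, validation_loss, accuracy) triples transcribed from the table above
rows = [
    (1000, 0.5203, 0.8004), (2000, 0.5209, 0.7964), (3000, 0.5435, 0.7924),
    (4000, 0.5242, 0.7880), (5000, 0.4816, 0.8233), (6000, 0.4483, 0.8265),
    (7000, 0.4751, 0.8209), (8000, 0.4560, 0.8285), (9000, 0.4330, 0.8414),
    (10000, 0.4171, 0.8430), (11000, 0.4194, 0.8442), (12000, 0.4215, 0.8474),
    (13000, 0.4750, 0.8390), (14000, 0.4592, 0.8486), (15000, 0.4914, 0.8418),
    (16000, 0.4804, 0.8454), (17000, 0.4809, 0.8466), (18000, 0.4777, 0.8466),
]
# Highest evaluation accuracy wins; note the lowest loss (0.4171 at step
# 10000) belongs to a different checkpoint, so loss was evidently not
# the selection criterion.
best_step, best_loss, best_acc = max(rows, key=lambda r: r[2])
```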
 ### Framework versions

 - Transformers 4.48.0.dev0
 - Pytorch 2.4.1+cu121
 - Datasets 3.1.0
+ - Tokenizers 0.21.0
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:84f55d8a2ff72e445e0a9b0c55f7a6374e00809bd8c71bce3a8162bf9798f90b
+ oid sha256:16209146102c1e03cac27a452fdaea9c6c349d804512591e46854b648865d495
 size 1343002540
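The `oid sha256:…` fields above are Git LFS pointer entries: the pointer stores the SHA-256 of the real file contents. A downloaded `model.safetensors` can be checked against the new oid with a chunked hash (a generic sketch, not part of this repo):

```python
import hashlib

def lfs_sha256(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks; the hex digest should equal the pointer's oid."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# For the checkpoint in this commit, lfs_sha256("model.safetensors")
# should return the new oid shown above.
```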
runs/Jan09_18-35-03_ip-10-10-13-247.eu-central-1.compute.internal/events.out.tfevents.1736447705.ip-10-10-13-247.eu-central-1.compute.internal.16223.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:76c592b4faeff148d0a01805ebe7f1df0fe2a3545c8dfd2b70b961c25fdc5b3b
+ size 91907
runs/Jan09_18-35-03_ip-10-10-13-247.eu-central-1.compute.internal/events.out.tfevents.1736457311.ip-10-10-13-247.eu-central-1.compute.internal.16223.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5a98640eb42cbe741fdda729e14bebb08f471c423b79e166ee6b7ff8d23f0946
+ size 569
tokenizer.json CHANGED
@@ -1,21 +1,7 @@
 {
 "version": "1.0",
- "truncation": {
- "direction": "Right",
- "max_length": 128,
- "strategy": "LongestFirst",
- "stride": 0
- },
- "padding": {
- "strategy": {
- "Fixed": 128
- },
- "direction": "Right",
- "pad_to_multiple_of": null,
- "pad_id": 0,
- "pad_type_id": 0,
- "pad_token": "[PAD]"
- },
+ "truncation": null,
+ "padding": null,
 "added_tokens": [
 {
 "id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:58ebd655a9e7d72b1516168e76c01a770c8396868835cbe1c2abf64d8f7c1917
+ oid sha256:7039d6e6f69534b56d6691a36fb45774864d8997b5d35932ef71c5d15ea0462a
 size 5432