yigagilbert committed
Commit 8ad05ca
1 Parent(s): 137a0b4

End of training

Browse files:
- README.md +199 -0
- generation_config.json +7 -0
- model.safetensors +1 -1
README.md
ADDED
@@ -0,0 +1,199 @@
---
license: apache-2.0
base_model: google/t5-efficient-tiny
tags:
- generated_from_trainer
model-index:
- name: salt_language_ID
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# salt_language_ID

This model is a fine-tuned version of [google/t5-efficient-tiny](https://huggingface.co/google/t5-efficient-tiny) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0523

## Model description

More information needed

## Intended uses & limitations

More information needed

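As a minimal usage sketch (assumptions: the checkpoint keeps the text-to-text interface of its T5 base, the task is language identification as the model name suggests, and the repo id below is hypothetical):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "yigagilbert/salt_language_ID"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The expected input/output format is undocumented; here we assume plain
# text in, a short text label (e.g. a language code) out.
inputs = tokenizer("Oli otya?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
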
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (mirrored in the `Seq2SeqTrainingArguments` sketch after the list):
- learning_rate: 0.0002
- train_batch_size: 32
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 1

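A sketch of `Seq2SeqTrainingArguments` mirroring the list above (the output directory is an assumption; data loading, the model, and the `Seq2SeqTrainer` wiring are omitted because the card does not document them):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="salt_language_ID",   # assumed output directory
    learning_rate=2e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=2,   # 32 * 2 = 64 total train batch size
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=1,
    adam_beta1=0.9,                  # Adam settings as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```
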
### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 290.8962 | 0.01 | 10 | 305.3102 |
| 294.9642 | 0.01 | 20 | 296.0765 |
| 281.0769 | 0.02 | 30 | 279.4232 |
| 270.1917 | 0.03 | 40 | 254.2555 |
| 243.3464 | 0.04 | 50 | 224.8921 |
| 214.4873 | 0.04 | 60 | 196.3753 |
| 187.9743 | 0.05 | 70 | 135.2601 |
| 164.024 | 0.06 | 80 | 105.8822 |
| 143.6121 | 0.06 | 90 | 90.4786 |
| 126.019 | 0.07 | 100 | 75.9475 |
| 111.2358 | 0.08 | 110 | 64.2843 |
| 96.1458 | 0.09 | 120 | 51.4159 |
| 83.5523 | 0.09 | 130 | 37.5541 |
| 69.0164 | 0.1 | 140 | 24.2653 |
| 55.3427 | 0.11 | 150 | 19.5405 |
| 38.9215 | 0.11 | 160 | 12.6305 |
| 28.0225 | 0.12 | 170 | 10.0512 |
| 21.42 | 0.13 | 180 | 6.2506 |
| 15.1783 | 0.14 | 190 | 3.1231 |
| 11.7336 | 0.14 | 200 | 1.5384 |
| 9.3141 | 0.15 | 210 | 1.0360 |
| 6.9583 | 0.16 | 220 | 0.5647 |
| 5.7743 | 0.16 | 230 | 0.5395 |
| 4.7486 | 0.17 | 240 | 0.5611 |
| 3.7387 | 0.18 | 250 | 0.4738 |
| 3.3398 | 0.19 | 260 | 0.4057 |
| 3.1383 | 0.19 | 270 | 0.7111 |
| 2.5906 | 0.2 | 280 | 0.3963 |
| 2.4711 | 0.21 | 290 | 0.6025 |
| 2.0874 | 0.21 | 300 | 0.4192 |
| 2.2178 | 0.22 | 310 | 0.5130 |
| 1.9783 | 0.23 | 320 | 0.2481 |
| 1.9655 | 0.24 | 330 | 0.2947 |
| 1.677 | 0.24 | 340 | 0.1795 |
| 1.5847 | 0.25 | 350 | 0.4913 |
| 1.6727 | 0.26 | 360 | 0.3358 |
| 1.5304 | 0.26 | 370 | 0.4296 |
| 1.4964 | 0.27 | 380 | 0.1527 |
| 1.3643 | 0.28 | 390 | 0.4387 |
| 1.1374 | 0.29 | 400 | 0.1458 |
| 1.0719 | 0.29 | 410 | 0.1550 |
| 1.2705 | 0.3 | 420 | 0.3249 |
| 0.863 | 0.31 | 430 | 0.1285 |
| 0.9644 | 0.31 | 440 | 0.2107 |
| 0.9679 | 0.32 | 450 | 0.1729 |
| 0.9753 | 0.33 | 460 | 0.2159 |
| 0.7938 | 0.33 | 470 | 0.3218 |
| 0.739 | 0.34 | 480 | 0.1385 |
| 0.6355 | 0.35 | 490 | 0.4408 |
| 0.8578 | 0.36 | 500 | 0.1109 |
| 0.758 | 0.36 | 510 | 0.1393 |
| 0.5958 | 0.37 | 520 | 0.1510 |
| 0.591 | 0.38 | 530 | 0.1199 |
| 0.6605 | 0.38 | 540 | 0.2369 |
| 0.6788 | 0.39 | 550 | 0.1469 |
| 0.5533 | 0.4 | 560 | 0.1505 |
| 0.517 | 0.41 | 570 | 0.1881 |
| 0.5682 | 0.41 | 580 | 0.1808 |
| 0.592 | 0.42 | 590 | 0.2145 |
| 0.4944 | 0.43 | 600 | 0.0838 |
| 0.4257 | 0.43 | 610 | 0.2627 |
| 0.4852 | 0.44 | 620 | 0.1266 |
| 0.457 | 0.45 | 630 | 0.1091 |
| 0.5455 | 0.46 | 640 | 0.1685 |
| 0.5514 | 0.46 | 650 | 0.0978 |
| 0.5117 | 0.47 | 660 | 0.1249 |
| 0.4067 | 0.48 | 670 | 0.0845 |
| 0.3931 | 0.48 | 680 | 0.1590 |
| 0.4459 | 0.49 | 690 | 0.1579 |
| 0.3834 | 0.5 | 700 | 0.0723 |
| 0.4204 | 0.51 | 710 | 0.1741 |
| 0.3586 | 0.51 | 720 | 0.0758 |
| 0.4346 | 0.52 | 730 | 0.1412 |
| 0.3643 | 0.53 | 740 | 0.0911 |
| 0.3513 | 0.53 | 750 | 0.1685 |
| 0.3544 | 0.54 | 760 | 0.0671 |
| 0.3327 | 0.55 | 770 | 0.2027 |
| 0.3361 | 0.56 | 780 | 0.0781 |
| 0.2818 | 0.56 | 790 | 0.1254 |
| 0.3959 | 0.57 | 800 | 0.0609 |
| 0.3416 | 0.58 | 810 | 0.2034 |
| 0.3283 | 0.58 | 820 | 0.0913 |
| 0.2884 | 0.59 | 830 | 0.1646 |
| 0.3753 | 0.6 | 840 | 0.1290 |
| 0.3436 | 0.61 | 850 | 0.1186 |
| 0.2789 | 0.61 | 860 | 0.1103 |
| 0.3437 | 0.62 | 870 | 0.0744 |
| 0.2594 | 0.63 | 880 | 0.0608 |
| 0.3105 | 0.63 | 890 | 0.0623 |
| 0.3247 | 0.64 | 900 | 0.0801 |
| 0.3111 | 0.65 | 910 | 0.0721 |
| 0.2956 | 0.66 | 920 | 0.0617 |
| 0.2971 | 0.66 | 930 | 0.1223 |
| 0.2329 | 0.67 | 940 | 0.0652 |
| 0.2615 | 0.68 | 950 | 0.0768 |
| 0.2812 | 0.68 | 960 | 0.0735 |
| 0.2756 | 0.69 | 970 | 0.0572 |
| 0.201 | 0.7 | 980 | 0.0582 |
| 0.2462 | 0.71 | 990 | 0.0830 |
| 0.2597 | 0.71 | 1000 | 0.0510 |
| 0.2546 | 0.72 | 1010 | 0.0990 |
| 0.2765 | 0.73 | 1020 | 0.0718 |
| 0.1772 | 0.73 | 1030 | 0.0505 |
| 0.2669 | 0.74 | 1040 | 0.0916 |
| 0.2215 | 0.75 | 1050 | 0.0609 |
| 0.2295 | 0.76 | 1060 | 0.0799 |
| 0.2413 | 0.76 | 1070 | 0.0573 |
| 0.2022 | 0.77 | 1080 | 0.0636 |
| 0.2539 | 0.78 | 1090 | 0.1083 |
| 0.2061 | 0.78 | 1100 | 0.0570 |
| 0.2561 | 0.79 | 1110 | 0.0522 |
| 0.2606 | 0.8 | 1120 | 0.0721 |
| 0.2368 | 0.81 | 1130 | 0.0514 |
| 0.2789 | 0.81 | 1140 | 0.0557 |
| 0.1802 | 0.82 | 1150 | 0.0945 |
| 0.27 | 0.83 | 1160 | 0.0987 |
| 0.2445 | 0.83 | 1170 | 0.0521 |
| 0.1911 | 0.84 | 1180 | 0.0461 |
| 0.2336 | 0.85 | 1190 | 0.0457 |
| 0.1881 | 0.86 | 1200 | 0.0653 |
| 0.2039 | 0.86 | 1210 | 0.0821 |
| 0.2503 | 0.87 | 1220 | 0.0493 |
| 0.2153 | 0.88 | 1230 | 0.0492 |
| 0.2307 | 0.88 | 1240 | 0.0691 |
| 0.1854 | 0.89 | 1250 | 0.0486 |
| 0.1896 | 0.9 | 1260 | 0.0575 |
| 0.2321 | 0.91 | 1270 | 0.0546 |
| 0.195 | 0.91 | 1280 | 0.0506 |
| 0.2457 | 0.92 | 1290 | 0.0556 |
| 0.15 | 0.93 | 1300 | 0.0525 |
| 0.2567 | 0.93 | 1310 | 0.0499 |
| 0.2352 | 0.94 | 1320 | 0.0524 |
| 0.2018 | 0.95 | 1330 | 0.0581 |
| 0.1591 | 0.96 | 1340 | 0.0549 |
| 0.2454 | 0.96 | 1350 | 0.0534 |
| 0.1595 | 0.97 | 1360 | 0.0525 |
| 0.179 | 0.98 | 1370 | 0.0524 |
| 0.19 | 0.98 | 1380 | 0.0512 |
| 0.1571 | 0.99 | 1390 | 0.0515 |
| 0.1611 | 1.0 | 1400 | 0.0522 |

### Framework versions

- Transformers 4.38.2
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.15.2
generation_config.json
ADDED
@@ -0,0 +1,7 @@
{
  "_from_model_config": true,
  "decoder_start_token_id": 0,
  "eos_token_id": 1,
  "pad_token_id": 0,
  "transformers_version": "4.38.2"
}
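For reference, this file is what `model.generate()` picks up by default; it can also be loaded explicitly (a sketch, with the repo id assumed):

```python
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("yigagilbert/salt_language_ID")  # assumed repo id
# Mirrors the JSON above: decoding starts at token 0, EOS is 1, padding is 0.
print(gen_config.decoder_start_token_id, gen_config.eos_token_id, gen_config.pad_token_id)
```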
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:
+ oid sha256:36e3cc1eecef69fc99b260097eea577c799d268e50a24ef67ab4aa4c20da7aa6
  size 95192240
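The pointer's `oid sha256:` and `size` fields allow an integrity check on a downloaded copy of the weights; a minimal sketch (the local file path is an assumption):

```python
import hashlib
import os

path = "model.safetensors"  # assumed local path to the downloaded weights
expected_sha256 = "36e3cc1eecef69fc99b260097eea577c799d268e50a24ef67ab4aa4c20da7aa6"
expected_size = 95192240

assert os.path.getsize(path) == expected_size, "size mismatch"
digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
        digest.update(chunk)
assert digest.hexdigest() == expected_sha256, "sha256 mismatch"
print("model.safetensors matches the LFS pointer")
```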