cassador committed on
Commit 3dbd92b
1 Parent(s): 3195c55

first commit

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": false,
  "pooling_mode_mean_tokens": true,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
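This pooling configuration enables only mean pooling: token embeddings are averaged (ignoring padding) to produce one 768-dimensional sentence vector. A minimal sketch in plain PyTorch of what masked mean pooling computes — an illustration, not the library's internal code:

```python
import torch

def mean_pool(token_embeddings: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token embeddings over real (non-padding) positions.

    token_embeddings: (batch, seq_len, 768); attention_mask: (batch, seq_len) of 0/1.
    """
    mask = attention_mask.unsqueeze(-1).float()        # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(dim=1)      # (batch, 768)
    counts = mask.sum(dim=1).clamp(min=1e-9)           # avoid division by zero on empty rows
    return summed / counts                             # (batch, 768) sentence embeddings
```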
README.md ADDED
@@ -0,0 +1,446 @@
---
language: []
library_name: sentence-transformers
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:10330
- loss:MultipleNegativesRankingLoss
base_model: indobenchmark/indobert-base-p2
datasets: []
metrics:
- pearson_cosine
- spearman_cosine
- pearson_manhattan
- spearman_manhattan
- pearson_euclidean
- spearman_euclidean
- pearson_dot
- spearman_dot
- pearson_max
- spearman_max
widget:
- source_sentence: Dari hasil pernikahan keduanya, Desy dikaruniai seorang anak, yaitu
    Nasywa Nathania Hamzah yang lahir pada tanggal 21 Juli 2002.
  sentences:
  - Aturan baru ini membuat pemain bertubuh tinggi kesulitan.
  - Penderita diabetes hendaknya mengonsumsi makanan dengan indeks glikemik rendah.
  - Seorang anak bernama Nasywa Nathania Hamzah lahir pada tanggal 21 Juli 2002.
- source_sentence: Marini pun langsung mengabadikan momen saat melihat bayi Iyek dan
    Dewi tersebut, ke akun Instagramnya.
  sentences:
  - Kurang dari 500 orang terjangkit virus Corona di Indonesia.
  - Marini memiliki akun instagram.
  - Spesies ini pernah dipublikasikan sebelum 1753.
- source_sentence: Tujuan kami sebenarnya adalah strategi untuk melindungi kulit dari
    radiasi UV dan kanker.
  sentences:
  - Pendapatan Airbnb berkurang sebelum Airbnb mengumumkan protokol kebersihan baru.
  - Kami membuat strategi ini karena tingginya pasien kanker kulit.
  - Terdapat distrik Faro di Portugis.
- source_sentence: Pembahasan tentang mitos gerhana bulan sebenarnya sudah terjadi
    sejak dulu. Ada yang percaya gerhana bulan berpengaruh pada bumi bahkan kesehatan
    tubuh.
  sentences:
  - Menonton orang main game seru.
  - Orang yang terkaya ke-35 di Indonesia adalah seorang pria.
  - Mitos gerhana bulan berpengaruh pada kesehatan tubuh.
- source_sentence: Waduk wadaslintang sebenarnya terbagi menjadi dua kabupaten yaitu
    kabupaten kebumen dan kabupaten wonosobo.
  sentences:
  - Amir menjadi pemimpin sayap kiri terdepan ketika Revolusi Nasional Indonesia berlangsung.
  - Musim ini di ajang PBL 2020 Hendra melawan tim Pune 7 aces.
  - Kabupaten kebumen dan kabupaten wonosobo bertentaggaan.
pipeline_tag: sentence-similarity
model-index:
- name: SentenceTransformer based on indobenchmark/indobert-base-p2
  results:
  - task:
      type: semantic-similarity
      name: Semantic Similarity
    dataset:
      name: sts dev
      type: sts-dev
    metrics:
    - type: pearson_cosine
      value: -0.051616661741529624
      name: Pearson Cosine
    - type: spearman_cosine
      value: -0.059260236757554256
      name: Spearman Cosine
    - type: pearson_manhattan
      value: -0.06426082223860986
      name: Pearson Manhattan
    - type: spearman_manhattan
      value: -0.06596359759097158
      name: Spearman Manhattan
    - type: pearson_euclidean
      value: -0.06368615893415144
      name: Pearson Euclidean
    - type: spearman_euclidean
      value: -0.06528449816144678
      name: Spearman Euclidean
    - type: pearson_dot
      value: -0.027898791319537007
      name: Pearson Dot
    - type: spearman_dot
      value: -0.02595347491107127
      name: Spearman Dot
    - type: pearson_max
      value: -0.027898791319537007
      name: Pearson Max
    - type: spearman_max
      value: -0.02595347491107127
      name: Spearman Max
---

# SentenceTransformer based on indobenchmark/indobert-base-p2

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [indobenchmark/indobert-base-p2](https://huggingface.co/indobenchmark/indobert-base-p2). It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [indobenchmark/indobert-base-p2](https://huggingface.co/indobenchmark/indobert-base-p2) <!-- at revision 94b4e0a82081fa57f227fcc2024d1ea89b57ac1f -->
- **Maximum Sequence Length:** 200 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 200, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```
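The two modules above mirror what `modules.json` and `1_Pooling/config.json` declare: a BERT encoder followed by mean pooling. Loading the published model restores this stack automatically, but as a rough illustration the same composition could be built by hand (a sketch only, using the base checkpoint name):

```python
from sentence_transformers import SentenceTransformer, models

# Rebuild the same two-module stack: BERT encoder + mean pooling (illustrative only)
word_embedding = models.Transformer("indobenchmark/indobert-base-p2", max_seq_length=200)
pooling = models.Pooling(word_embedding.get_word_embedding_dimension(), pooling_mode="mean")
model = SentenceTransformer(modules=[word_embedding, pooling])
```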
## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Waduk wadaslintang sebenarnya terbagi menjadi dua kabupaten yaitu kabupaten kebumen dan kabupaten wonosobo.',
    'Kabupaten kebumen dan kabupaten wonosobo bertentaggaan.',
    'Musim ini di ajang PBL 2020 Hendra melawan tim Pune 7 aces.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
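Beyond pairwise similarity, the same embeddings can drive semantic search. A minimal sketch using `sentence_transformers.util.semantic_search`; the corpus and query strings are made-up placeholders, and `sentence_transformers_model_id` is the same placeholder as above:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id, as above

# Hypothetical corpus and query
corpus = [
    "Kabupaten Kebumen dan Kabupaten Wonosobo bertetangga.",
    "Musim ini di ajang PBL 2020 Hendra melawan tim Pune 7 aces.",
]
query = "Waduk Wadaslintang terbagi menjadi dua kabupaten."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Returns, per query, the top-k corpus entries ranked by cosine similarity
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
print(hits[0])  # e.g. [{'corpus_id': 0, 'score': ...}, {'corpus_id': 1, 'score': ...}]
```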
<!--
### Direct Usage (Transformers)

<details><summary>Click to see the direct usage in Transformers</summary>

</details>
-->

<!--
### Downstream Usage (Sentence Transformers)

You can finetune this model on your own dataset.

<details><summary>Click to expand</summary>

</details>
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation

### Metrics

#### Semantic Similarity
* Dataset: `sts-dev`
* Evaluated with [<code>EmbeddingSimilarityEvaluator</code>](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.EmbeddingSimilarityEvaluator)

| Metric             | Value      |
|:-------------------|:-----------|
| pearson_cosine     | -0.0516    |
| spearman_cosine    | -0.0593    |
| pearson_manhattan  | -0.0643    |
| spearman_manhattan | -0.066     |
| pearson_euclidean  | -0.0637    |
| spearman_euclidean | -0.0653    |
| pearson_dot        | -0.0279    |
| spearman_dot       | -0.026     |
| pearson_max        | -0.0279    |
| **spearman_max**   | **-0.026** |
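For reference, a comparable evaluation can be set up with the evaluator linked above. This is only a sketch with hypothetical dev pairs; the actual `sts-dev` split and its gold scores are not part of this commit:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id, as in Usage

# Hypothetical dev pairs; replace with the real sts-dev sentences and scores in [0, 1]
dev_evaluator = EmbeddingSimilarityEvaluator(
    sentences1=["Marini memiliki akun instagram.", "Pulau Timor memiliki 10 kota bandar."],
    sentences2=["Marini mengunggah foto ke Instagram.", "Pulau Timor mempunyai 12 kota bandar."],
    scores=[0.8, 0.2],
    name="sts-dev",
)
results = dev_evaluator(model)  # dict of pearson/spearman metrics in sentence-transformers 3.x
print(results)
```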
<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 10,330 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
  |         | sentence_0                                                                           | sentence_1                                                                         | label                                                               |
  |:--------|:-------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------|:----------------------------------------------------------------------|
  | type    | string                                                                               | string                                                                             | int                                                                 |
  | details | <ul><li>min: 11 tokens</li><li>mean: 29.14 tokens</li><li>max: 179 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 11.95 tokens</li><li>max: 41 tokens</li></ul> | <ul><li>0: ~36.30%</li><li>1: ~32.90%</li><li>2: ~30.80%</li></ul> |
* Samples:
  | sentence_0 | sentence_1 | label |
  |:-----------|:-----------|:------|
  | <code>Pada tahun 1436, pulau Timor mempunyai 12 kota bandar namun tidak disebutkan namanya.</code> | <code>Pulau Timor memiliki 10 kota bandar.</code> | <code>2</code> |
  | <code>Komoditas pertanian yang ada di desa ini antara lain: bunga potong, sayur mayur, waluh (lejet) terutama Paprika (Capsicum annuum L.). Komoditas ini menjadi sumber perekonomian utama di desa ini karena harganya yang lumayan dibandingkan sayuran lain.</code> | <code>Komoditas pertanian di desa ini lebih mahal dibandingkan sayuran lain.</code> | <code>1</code> |
  | <code>Setelah batas waktu pencalonan pada tanggal 15 Juli 2003, sembilan kota telah mencalonkan diri untuk mengadakan Olimpiade 2012. Kota-kota tersebut adalah Havana, Istanbul, Leipzig, London, Madrid, Moskwa, New York City, Paris, dan Rio de Janeiro. Pada 18 Mei 2004, Komite Olimpiade Internasional (IOC), sebagai hasil penilaian teknis, mengurangi jumlah kota kandidat menjadi lima: London, Madrid, Moskwa, New York, dan Paris.</code> | <code>Jumlah kota kandidat tuan rumah olimpide bertambah pada 18 Mei 2004.</code> | <code>2</code> |
* Loss: [<code>MultipleNegativesRankingLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim"
  }
  ```
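As a rough sketch of how a run with this loss and the hyperparameters listed in the next subsection could be reproduced with Sentence Transformers 3.x; the dataset rows and output directory below are placeholders, not the actual 10,330 training samples, and `MultipleNegativesRankingLoss` treats the first column as the anchor and the second as its positive:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("indobenchmark/indobert-base-p2")

# Placeholder pairs; the real run used 10,330 (sentence_0, sentence_1, label) samples
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "Waduk Wadaslintang terbagi menjadi dua kabupaten.",
        "Marini mengunggah momen itu ke akun Instagramnya.",
    ],
    "sentence_1": [
        "Waduk Wadaslintang berada di dua kabupaten.",
        "Marini memiliki akun Instagram.",
    ],
})

loss = MultipleNegativesRankingLoss(model)  # defaults: scale=20.0, cosine similarity

args = SentenceTransformerTrainingArguments(
    output_dir="indobert-sts-output",   # hypothetical path
    num_train_epochs=3,
    per_device_train_batch_size=32,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```

The actual run additionally evaluated every few steps with the `sts-dev` evaluator (see `eval_strategy: steps` below), which requires passing an evaluator or eval dataset to the trainer.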
### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `multi_dataset_batch_sampler`: round_robin

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 32
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 3
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: round_robin

</details>
### Training Logs
| Epoch  | Step | Training Loss | sts-dev_spearman_max |
|:------:|:----:|:-------------:|:--------------------:|
| 0.0991 | 32   | -             | -0.0592              |
| 0.1981 | 64   | -             | -0.0425              |
| 0.2972 | 96   | -             | -0.0467              |
| 0.3963 | 128  | -             | -0.0428              |
| 0.4954 | 160  | -             | -0.0512              |
| 0.5944 | 192  | -             | -0.0473              |
| 0.6935 | 224  | -             | -0.0412              |
| 0.7926 | 256  | -             | -0.0435              |
| 0.8916 | 288  | -             | -0.0405              |
| 0.9907 | 320  | -             | -0.0425              |
| 1.0    | 323  | -             | -0.0420              |
| 1.0898 | 352  | -             | -0.0346              |
| 1.1889 | 384  | -             | -0.0333              |
| 1.2879 | 416  | -             | -0.0325              |
| 1.3870 | 448  | -             | -0.0312              |
| 1.4861 | 480  | -             | -0.0316              |
| 1.5480 | 500  | 0.077         | -                    |
| 1.5851 | 512  | -             | -0.0260              |
### Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.0+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.2
- Tokenizers: 0.19.1
## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,47 @@
{
  "_name_or_path": "indobenchmark/indobert-base-p2",
  "_num_labels": 5,
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0",
    "1": "LABEL_1",
    "2": "LABEL_2",
    "3": "LABEL_3",
    "4": "LABEL_4"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0,
    "LABEL_1": 1,
    "LABEL_2": 2,
    "LABEL_3": 3,
    "LABEL_4": 4
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "output_past": true,
  "pad_token_id": 0,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.41.2",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 50000
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.0.1",
    "transformers": "4.41.2",
    "pytorch": "2.3.0+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:512723ee70ae5f1afc06c3eb7e0681bd61df080baaa9e535d04d80e3ee9bb98e
size 497787752
modules.json ADDED
@@ -0,0 +1,14 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  }
]
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:bb1b8d94c2acbb6f07b1b5517b79bf6a9f831391597f9be9dd0855d9437cbe05
size 497850982
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 200,
  "do_lower_case": false
}
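This file sets the encoder's effective input length (200 wordpieces; longer inputs are truncated). When the model is loaded with Sentence Transformers, the limit is exposed, and adjustable, as a property. A small sketch, reusing the placeholder model id from the README:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence_transformers_model_id")  # placeholder id
print(model.max_seq_length)  # 200, taken from sentence_bert_config.json
model.max_seq_length = 128   # can be lowered; raising it is bounded by BERT's 512 position embeddings
```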
special_tokens_map.json ADDED
@@ -0,0 +1,7 @@
{
  "cls_token": "[CLS]",
  "mask_token": "[MASK]",
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "unk_token": "[UNK]"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "4": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 200,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff