l3cube-pune committed
Commit bcc1c2f
1 Parent(s): d3fe12c

Update model files
README.md CHANGED
@@ -5,67 +5,14 @@ tags:
 - feature-extraction
 - sentence-similarity
 - transformers
-language:
-- multilingual
-- hi
-- mr
-- kn
-- ta
-- te
-- ml
-- gu
-- or
-- pa
-- bn
-widget:
-- source_sentence: दिवाळी आपण मोठ्या उत्साहाने साजरी करतो
-  sentences:
-  - दिवाळी आपण आनंदाने साजरी करतो
-  - दिवाळी हा दिव्यांचा सण आहे
-  example_title: Monolingual- Marathi
-- source_sentence: हम दीपावली उत्साह के साथ मनाते हैं
-  sentences:
-  - हम दीपावली खुशियों से मनाते हैं
-  - दिवाली रोशनी का त्योहार है
-  example_title: Monolingual- Hindi
-- source_sentence: અમે ઉત્સાહથી દિવાળી ઉજવીએ છીએ
-  sentences:
-  - દિવાળી આપણે ખુશીઓથી ઉજવીએ છીએ
-  - દિવાળી એ રોશનીનો તહેવાર છે
-  example_title: Monolingual- Gujarati
-- source_sentence: आम्हाला भारतीय असल्याचा अभिमान आहे
-  sentences:
-  - हमें भारतीय होने पर गर्व है
-  - భారతీయులమైనందుకు గర్విస్తున్నాం
-  - અમને ભારતીય હોવાનો ગર્વ છે
-  example_title: Cross-lingual 1
-- source_sentence: ਬਾਰਿਸ਼ ਤੋਂ ਬਾਅਦ ਬਗੀਚਾ ਸੁੰਦਰ ਦਿਖਾਈ ਦਿੰਦਾ ਹੈ
-  sentences:
-  - മഴയ്ക്ക് ശേഷം പൂന്തോട്ടം മനോഹരമായി കാണപ്പെടുന്നു
-  - ବର୍ଷା ପରେ ବଗିଚା ସୁନ୍ଦର ଦେଖାଯାଏ |
-  - बारिश के बाद बगीचा सुंदर दिखता है
-  example_title: Cross-lingual 2
----
-
-# IndicSBERT
-
-This is a MuRIL model (google/muril-base-cased) trained on the NLI dataset of ten major Indian languages. <br>
-The single model works for Hindi, Marathi, Kannada, Tamil, Telugu, Gujarati, Oriya, Punjabi, Malayalam, and Bengali.
-The model also has cross-lingual capabilities. <br>
-Released as a part of project MahaNLP: https://github.com/l3cube-pune/MarathiNLP <br>
-
-A better sentence similarity model (a fine-tuned version of this model) is shared here: https://huggingface.co/l3cube-pune/indic-sentence-similarity-sbert <br>
-
-More details on the dataset, models, and baseline results can be found in our [paper](https://arxiv.org/abs/2211.11187)
-
-```
-@article{joshi2022l3cubemahasbert,
-  title={L3Cube-MahaSBERT and HindSBERT: Sentence BERT Models and Benchmarking BERT Sentence Representations for Hindi and Marathi},
-  author={Joshi, Ananya and Kajale, Aditi and Gadre, Janhavi and Deode, Samruddhi and Joshi, Raviraj},
-  journal={arXiv preprint arXiv:2211.11187},
-  year={2022}
-}
-```
+---
+
+# {MODEL_NAME}
+
+This is a [sentence-transformers](https://www.SBERT.net) model: it maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for tasks like clustering or semantic search.
+
+<!--- Describe your model here -->
 
 ## Usage (Sentence-Transformers)

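The usage snippet itself is elided between the two hunks; it follows the standard sentence-transformers loading pattern. A minimal sketch, assuming `sentence-transformers` is installed — the model id here is the fine-tuned similarity checkpoint the old card links to (`l3cube-pune/indic-sentence-similarity-sbert`), used as a stand-in, and the two sentences are the Marathi widget examples from the removed front matter:

```python
# Minimal usage sketch (pip install sentence-transformers). The model id is
# the fine-tuned variant linked from the old README, used here as a stand-in.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("l3cube-pune/indic-sentence-similarity-sbert")

# Marathi widget examples from the removed YAML front matter:
# "We celebrate Diwali with great enthusiasm" / "We celebrate Diwali with joy".
sentences = [
    "दिवाळी आपण मोठ्या उत्साहाने साजरी करतो",
    "दिवाळी आपण आनंदाने साजरी करतो",
]

embeddings = model.encode(sentences)  # shape (2, 768): 768-dim dense vectors
print(util.cos_sim(embeddings[0], embeddings[1]))  # cosine similarity of the pair
```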
@@ -122,4 +69,61 @@ sentence_embeddings = mean_pooling(model_output, encoded_input['attention_mask'])
 
 print("Sentence embeddings:")
 print(sentence_embeddings)
-```
+```
+
+
+
+## Evaluation Results
+
+<!--- Describe how your model was evaluated -->
+
+For an automated evaluation of this model, see the *Sentence Embeddings Benchmark*: [https://seb.sbert.net](https://seb.sbert.net?model_name={MODEL_NAME})
+
+
+## Training
+The model was trained with the parameters:
+
+**DataLoader**:
+
+`sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader` of length 88058 with parameters:
+```
+{'batch_size': 32}
+```
+
+**Loss**:
+
+`sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss` with parameters:
+```
+{'scale': 20.0, 'similarity_fct': 'cos_sim'}
+```
+
+Parameters of the fit()-Method:
+```
+{
+    "epochs": 1,
+    "evaluation_steps": 0,
+    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
+    "max_grad_norm": 1,
+    "optimizer_class": "<class 'torch.optim.adamw.AdamW'>",
+    "optimizer_params": {
+        "lr": 2e-05
+    },
+    "scheduler": "WarmupLinear",
+    "steps_per_epoch": null,
+    "warmup_steps": 8805,
+    "weight_decay": 0.01
+}
+```
+
+
+## Full Model Architecture
+```
+SentenceTransformer(
+  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
+  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
+)
+```
+
+## Citing & Authors
+
+<!--- Describe where people can find more information -->
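The DataLoader, Loss, and fit() parameters listed in the added Training section map one-to-one onto the sentence-transformers API. A hedged reconstruction of the run: the `train_examples` pairs are placeholders (the card does not ship the ten-language NLI data), while the hyperparameters are copied from the card; loading `google/muril-base-cased` directly yields roughly the Transformer + mean-pooling composition shown under Full Model Architecture.

```python
# Sketch of the training setup implied by the card above; the data is a
# placeholder, the hyperparameters are copied verbatim from the card.
from sentence_transformers import SentenceTransformer, InputExample, losses
from sentence_transformers.datasets import NoDuplicatesDataLoader

# Loading the plain HF checkpoint creates a Transformer module followed by
# mean pooling over 768-dim word embeddings, as listed in the card.
model = SentenceTransformer("google/muril-base-cased")

# Placeholder sentence pairs; the real run iterated 88058 batches of 32
# over an NLI corpus covering ten Indian languages.
train_examples = [
    InputExample(texts=["हम दीपावली उत्साह के साथ मनाते हैं",   # anchor
                        "हम दीपावली खुशियों से मनाते हैं"]),    # positive
    # ... many more pairs ...
]

# NoDuplicatesDataLoader with batch_size=32, as listed under **DataLoader**.
train_dataloader = NoDuplicatesDataLoader(train_examples, batch_size=32)

# MultipleNegativesRankingLoss with scale=20.0 and cosine similarity
# (both are also the library defaults).
train_loss = losses.MultipleNegativesRankingLoss(model, scale=20.0)

# fit() arguments copied from the card's parameter dump.
model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    epochs=1,
    scheduler="WarmupLinear",
    warmup_steps=8805,
    optimizer_params={"lr": 2e-05},
    weight_decay=0.01,
    max_grad_norm=1,
)
```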
config_sentence_transformers.json CHANGED
@@ -1,7 +1,7 @@
 {
   "__version__": {
     "sentence_transformers": "2.2.2",
-    "transformers": "4.25.1",
-    "pytorch": "1.13.0+cu116"
+    "transformers": "4.26.1",
+    "pytorch": "1.13.1+cu116"
   }
 }
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c343aede4587c66882101abd230dbc2780ae60d2951ec0ec8165736aa6f73b5d
+oid sha256:2a334f3074c87d57afa774c616d9d6e8c80db62c75afbb4b65bfadd63c72a169
 size 950293293
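Both versions of pytorch_model.bin are Git LFS pointer files: the commit only swaps the sha256 oid, while the size stays 950293293 bytes. Since an LFS oid is the SHA-256 of the file contents, a downloaded checkpoint can be checked against the new pointer; a small sketch, assuming the file sits in the working directory:

```python
# Verify a downloaded pytorch_model.bin against the new LFS pointer above.
import hashlib

EXPECTED_OID = "2a334f3074c87d57afa774c616d9d6e8c80db62c75afbb4b65bfadd63c72a169"

sha = hashlib.sha256()
with open("pytorch_model.bin", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        sha.update(chunk)

assert sha.hexdigest() == EXPECTED_OID, "checksum mismatch: not the committed blob"
print("checksum OK")
```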