---
base_model: dunzhang/stella_en_1.5B_v5
datasets: []
language: []
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:99000
- loss:MultipleNegativesSymmetricRankingLoss
widget:
- source_sentence: 'Instruct: Given a web search query, retrieve relevant passages
that answer the query.
Query: Glay'
sentences:
- The Theory of Good and Evil is a 1907 book about ethics by the English philosopher
Hastings Rashdall, in which the author expounds a theory he calls "ideal utilitarianism".
It has been seen as Rashdall's most important philosophical work.
- GLAY is a Japanese rock band , formed in Hakodate in 1988 . Glay primarily composes
songs in the rock and pop genres , but they have also arranged songs using elements
from a wide variety of genres , including punk , electronic , R&B , progressive
rock , folk , reggae , gospel , and ska . Originally a visual kei band , the group
slowly shifted to less dramatic attire through the years . As of 2008 , Glay had
sold an estimated 51 million records ; 28 million singles and 23 million albums
, making them one of the top ten best-selling artists of all time in Japan .
- Aashirwad is a 1968 Bollywood film , directed by Hrishikesh Mukherjee . The film
stars Ashok Kumar and Sanjeev Kumar . The film is notable for its inclusion
of a rap-like song performed by Ashok Kumar , `` Rail Gaadi '' .
- source_sentence: 'Instruct: Given a web search query, retrieve relevant passages
that answer the query.
Query: Indexing does not work with index package'
sentences:
- 'I am trying to do indexing with the following code: \documentclass[a4paper]{article} \usepackage{index} \makeindex \newindex{aut}{adx}{and}{Name
Index} \begin{document} Hellow \index[aut]{FiRST} \printindex[aut] \end{document} Acccording
to documention of the `index` package it should work. But makeindex creates empty
`.idx` and `.ind`. If I run code like this: \documentclass[a4paper]{article} \usepackage{index} \makeindex \begin{document} Hellow
\index{FiRST} \printindex \end{document} It runs. But I need to have
user-defined index. Please help me with it. I''ve searched for several hours on
internet, but without success.'
- 'Body materials may include, but are not limited to, any of these materials:'
- Berberis aemulans is a shrub endemic to the region of Sichuan in southern China.
It grows there in thickets and on slopes at elevations of 2900-3200 m.Berberis
aemulans is a deciduous shrub up to 2 m tall, with spines along the branches.
Leaves are simple, elliptical to ovate, up to 4 cm long, lighter in color on the
underside because of a waxy layer. Flowers are in simple racemes of only a few
flowers. Berries egg-shaped, orange, up to 16 mm long.
- source_sentence: 'Instruct: Given a web search query, retrieve relevant passages
that answer the query.
Query: Parodi''s hemispingus'
sentences:
- Another event dubbed a "Battle of the Sexes" took place during the 1998 Australian
Open[51] between Karsten Braasch and the Williams sisters. Venus and Serena Williams
had claimed that they could beat any male player ranked outside the world's top
200, so Braasch, then ranked 203rd, challenged them both. Braasch was described
by one journalist as "a man whose training regime centered around a pack of cigarettes
and more than a couple bottles of ice cold lager".[52][51] The matches took place
on court number 12 in Melbourne Park,[53] after Braasch had finished a round of
golf and two shandies. He first took on Serena and after leading 5–0, beat her
6–1. Venus then walked on court and again Braasch was victorious, this time winning
6–2.[54] Braasch said afterwards, "500 and above, no chance". He added that he
had played like someone ranked 600th in order to keep the game "fun".[55] Braasch
said the big difference was that men can chase down shots much easier, and that
men put spin on the ball that the women can't handle. The Williams sisters adjusted
their claim to beating men outside the top 350.[51]
- The Parodi 's hemispingus ( Hemispingus parodii ) is a species of bird in the
family Thraupidae that is endemic to Peru . Its natural habitat is subtropical
or tropical moist montane forests .
- 'I need help because my Minecraft launcher doesn''t work... It''s been a long
time I haven''t played Minecraft and until now it worked nicely. But now that
I want to play on it again and I run the launcher, this appears (click images
to enlarge): ![enter image description here](http://i.stack.imgur.com/hvD9R.png)
At the bottom left of the screen the profile names keep loading (normally my username
appears in the box) and as you can see I am unable to click on the "Play" button.
I tried creating another profile but it doesn''t work because soon after they
ask to enter my Minecraft username and password. The password I entered disappears
and it keeps loading (I''ve tried waiting like, 30 minutes and it still doesn''t
work) so this is definitely not normal. ![enter image description here](http://i.stack.imgur.com/yDYjX.png)
![enter image description here](http://i.stack.imgur.com/4Nf1L.png) ![enter image
description here](http://i.stack.imgur.com/T6cJu.png) So basically I can''t play
on Minecraft anymore (version 1.7.9)... P.S. I use Windows 7.'
- source_sentence: 'Instruct: Given a web search query, retrieve relevant passages
that answer the query.
Query: Mahabharata'
sentences:
- The epic employs the story within a story structure, otherwise known as frametales,
popular in many Indian religious and non-religious works. It is first recited
at Takshashila by the sage Vaiśampāyana,[12][13] a disciple of Vyāsa, to the King
Janamejaya who is the great-grandson of the Pāṇḍava prince Arjuna. The story is
then recited again by a professional storyteller named Ugraśrava Sauti, many years
later, to an assemblage of sages performing the 12-year sacrifice for the king
Saunaka Kulapati in the Naimiśa Forest.
- 'Guncati (Serbian Cyrillic: Гунцати) is a suburban settlement of Belgrade, the
capital of Serbia. It is located in the municipality of Barajevo.Guncati is located
west of the municipal seat of Barajevo, halfway between the Belgrade-Bar railway
and Ibarska magistrala (Highway of Ibar).It is a rural settlement with a steady
population growth: from 1,718 (Census 1991) to 2,102 (Census 2002).'
- Beck 's Brewery , also known as Brauerei Beck & Co. , is a brewery in the northern
German city of Bremen . In 2001 , Interbrew agreed to buy Brauerei Beck for 1.8
billion euro ; at that time it was the fourth largest brewer in Germany . US manufacture
of Beck 's Brew has been based in St. Louis , Missouri , since early 2012 but
some customers have rebelled against the US market version . Since 2008 , it
has been owned by the Interbrew subsidiary of Anheuser-Busch InBev SA/NV . The
Beck 's Art Label Campaign has offered artists the opportunity to provide designs
to replace the brand 's label . It started in London in 1987 with Gilbert and
George . The artists created an art label , because Beck 's sponsored their retrospective
at the Hayward Gallery . The labels of the 2000 limited edition Beck 's bottles
were matching their exhibition poster . Other participants of the Art Label Campaign
are members of the loose group `` Young British Artists '' and nominees or winners
of the Turner Prize . Damien Hirst for example , designed a label for Beck 's
in 1995 , showing his famous spots . In 2000 , Tracey Emin created a label , which
shows herself , posing in a bathtub . Furthermore , Rachel Whiteread designed
a label in 1993 , presenting her artwork `` house '' , which was also financed
by Beck 's . The Art Label Campaign has also been parodied by Matthew Higgs ,
who is a member of the British art collective `` Bank '' . In the Bank exhibition
`` The Charge of the Light Brigade '' in 1995 , he brewed a beer , called `` Kunstlerbrau
'' . In 2012 , Beck 's started giving young and independent musicians the opportunity
to design a label for the Beck 's bottle . Beck 's summer 2009 limited-edition
labels were designed by the musical groups Hard-Fi and Ladyhawke .
- source_sentence: 'Instruct: Given a web search query, retrieve relevant passages
that answer the query.
Query: Ahu A Umi Heiau'
sentences:
- The 1967 All-Ireland Intermediate Hurling Championship was the seventh staging
of the All-Ireland hurling championship. The championship ended on 17 September
1967.Tipperary were the defending champions, however, they were defeated in the
provincial championship. London won the title after defeating Cork by 1-9 to 1-5
in the final.
- 'The digit ratio is the ratio of the lengths of different digits or fingers typically
measured from the midpoint of bottom crease ( where the finger joins the hand
) to the tip of the finger . It has been suggested by some scientists that the
ratio of two digits in particular , the 2nd ( index finger ) and 4th ( ring finger
) , is affected by exposure to androgens , e.g. , testosterone while in the uterus
and that this 2D :4 D ratio can be considered a crude measure for prenatal androgen
exposure , with lower 2D :4 D ratios pointing to higher prenatal androgen exposure
. The 2D :4 D ratio is calculated by dividing the length of the index finger of
a given hand by the length of the ring finger of the same hand . A longer index
finger will result in a ratio higher than 1 , while a longer ring finger will
result in a ratio lower than 1 . The 2D :4 D digit ratio is sexually dimorphic
: although the second digit is typically shorter in both females and males , the
difference between the lengths of the two digits is greater in males than in females
. A number of studies have shown a correlation between the 2D :4 D digit ratio
and various physical and behavioral traits .'
- Ahu A ʻ Umi Heiau means "shrine at the temple of ʻ Umi" in the Hawaiian Language.
---
# SentenceTransformer based on dunzhang/stella_en_1.5B_v5
This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [dunzhang/stella_en_1.5B_v5](https://huggingface.co/dunzhang/stella_en_1.5B_v5). It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
## Model Details
### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [dunzhang/stella_en_1.5B_v5](https://huggingface.co/dunzhang/stella_en_1.5B_v5) <!-- at revision 129dc50d3ca5f0f5ee0ce8944f65a8553c0f26e0 -->
- **Maximum Sequence Length:** 8096 tokens
- **Output Dimensionality:** 1024 dimensions
- **Similarity Function:** Cosine Similarity
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
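Since the model compares embeddings with cosine similarity, a small worked example may help make the score concrete. The sketch below is plain Python on toy 3-dimensional vectors (the real embeddings have 1024 entries); the helper name `cosine_similarity` and the vectors are illustrative assumptions, not part of this model or of the Sentence Transformers API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors (assumed for illustration only).
query = [1.0, 0.0, 1.0]
close = [2.0, 0.0, 2.0]  # same direction as the query, different length
far = [0.0, 1.0, 0.0]    # orthogonal to the query

sim_close = cosine_similarity(query, close)  # close to 1: same direction
sim_far = cosine_similarity(query, far)      # 0: no shared direction
```

Because the score depends only on direction, scaling an embedding does not change its similarity to other embeddings.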
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
### Full Model Architecture
```
SentenceTransformer(
(0): Transformer({'max_seq_length': 8096, 'do_lower_case': False}) with Transformer model: Qwen2Model
(1): Pooling({'word_embedding_dimension': 1536, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Dense({'in_features': 1536, 'out_features': 1024, 'bias': True, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
```
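To illustrate what modules (1) and (2) do, here is a minimal sketch of mean pooling followed by a linear projection with identity activation. It uses toy sizes (3 hidden units projected to 2) instead of the real 1536 to 1024, and the function names, weights, and mask-aware averaging are assumptions chosen for illustration rather than the library's actual implementation.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention mask selects no tokens")
    return [t / count for t in total]


def dense(x, weight, bias):
    """Linear layer with identity activation: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


# Toy input: three token vectors, the last one is padding (mask 0).
tokens = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [100.0, 100.0, 100.0]]
mask = [1, 1, 0]

# Toy projection from 3 to 2 dimensions (hypothetical weights).
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
bias = [0.5, -1.0]

pooled = mean_pool(tokens, mask)                  # one vector per sentence
sentence_embedding = dense(pooled, weight, bias)  # projected embedding
```

With `include_prompt` set to `True` in the pooling configuration above, the tokens of an instruction prefix are averaged together with the rest of the input.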
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")  # placeholder: replace with this model's Hub ID
# Run inference
sentences = [
'Instruct: Given a web search query, retrieve relevant passages that answer the query.\nQuery: Ahu A Umi Heiau',
'Ahu A ʻ Umi Heiau means "shrine at the temple of ʻ Umi" in the Hawaiian Language.',
'The digit ratio is the ratio of the lengths of different digits or fingers typically measured from the midpoint of bottom crease ( where the finger joins the hand ) to the tip of the finger . It has been suggested by some scientists that the ratio of two digits in particular , the 2nd ( index finger ) and 4th ( ring finger ) , is affected by exposure to androgens , e.g. , testosterone while in the uterus and that this 2D :4 D ratio can be considered a crude measure for prenatal androgen exposure , with lower 2D :4 D ratios pointing to higher prenatal androgen exposure . The 2D :4 D ratio is calculated by dividing the length of the index finger of a given hand by the length of the ring finger of the same hand . A longer index finger will result in a ratio higher than 1 , while a longer ring finger will result in a ratio lower than 1 . The 2D :4 D digit ratio is sexually dimorphic : although the second digit is typically shorter in both females and males , the difference between the lengths of the two digits is greater in males than in females . A number of studies have shown a correlation between the 2D :4 D digit ratio and various physical and behavioral traits .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
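The examples in this card prefix each query with an instruction and leave the passages unprefixed. The sketch below shows one way to build that query string and to turn a row of similarity scores into a ranking. The helpers `build_query` and `rank_passages` and the scores are hypothetical; in practice the scores would come from `model.similarity`.

```python
# Task description copied from the examples in this card.
TASK = "Given a web search query, retrieve relevant passages that answer the query."


def build_query(query, task=TASK):
    """Format a query the way the examples in this card do."""
    return f"Instruct: {task}\nQuery: {query}"


def rank_passages(scores):
    """Return passage indices ordered from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


prompt = build_query("Ahu A Umi Heiau")

# Made-up similarity scores between one query and three passages.
example_scores = [0.12, 0.83, 0.40]
order = rank_passages(example_scores)  # index of the best passage comes first
```

The resulting `prompt` string is what would be passed to `model.encode` as the query, alongside the raw passage texts.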
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
### Training Logs
| Epoch  | Step | Training Loss | retrieval loss |
|:------:|:----:|:-------------:|:-------------:|
| 0.6466 | 500 | 0.0424 | 0.0060 |
| 1.2932 | 1000 | 0.0073 | 0.0040 |
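The epoch and step columns can be combined with the `dataset_size:99000` tag to estimate how many samples were consumed per optimizer step. This is a back-of-the-envelope sketch, assuming the tag counts training samples and that every sample is seen once per epoch; the card itself does not state the batch size.

```python
# Values taken from the tag and the table in this card.
dataset_size = 99_000
logged = [(0.6466, 500), (1.2932, 1000)]  # (epoch, step) pairs

# Steps needed to complete one epoch, estimated from each logged row.
steps_per_epoch = [step / epoch for epoch, step in logged]
average_steps = sum(steps_per_epoch) / len(steps_per_epoch)

# Estimated samples consumed per step (effective batch size).
samples_per_step = dataset_size / average_steps

print(f"steps per epoch: about {average_steps:.1f}")
print(f"samples per step: about {samples_per_step:.1f}")
```

Both rows give the same estimate of roughly 773 steps per epoch, which would correspond to an effective batch size of about 128 under the assumptions above.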
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->