Training in progress, step 8811, checkpoint
metadata
language:
  - en
library_name: sentence-transformers
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:187790
  - loss:AdaptiveLayerLoss
  - loss:CoSENTLoss
  - loss:GISTEmbedLoss
  - loss:OnlineContrastiveLoss
  - loss:MultipleNegativesSymmetricRankingLoss
base_model: microsoft/deberta-v3-small
datasets:
  - sentence-transformers/all-nli
  - sentence-transformers/stsb
  - tals/vitaminc
  - nyu-mll/glue
  - allenai/scitail
  - sentence-transformers/xsum
  - sentence-transformers/sentence-compression
  - allenai/sciq
  - allenai/qasc
  - allenai/openbookqa
  - sentence-transformers/msmarco-msmarco-distilbert-base-v3
  - sentence-transformers/natural-questions
  - sentence-transformers/trivia-qa
  - sentence-transformers/quora-duplicates
  - sentence-transformers/gooaq
metrics:
  - pearson_cosine
  - spearman_cosine
  - pearson_manhattan
  - spearman_manhattan
  - pearson_euclidean
  - spearman_euclidean
  - pearson_dot
  - spearman_dot
  - pearson_max
  - spearman_max
widget:
  - source_sentence: What are predators?
    sentences:
      - a windmill does not create pollution
      - fire causes burning
      - carnivores are predators
  - source_sentence: A man in a black shirt is playing a guitar.
    sentences:
      - The man is wearing black.
      - There are people on the subway.
      - Boy dressed in blue holds a toy.
  - source_sentence: >-
      The Infrared Detector Laboratory built the Near Infrared Camera and
      Multi-Object Spectrometer (NICMOS) instrument for the Hubble Space
      Telescope and the Multiband Imaging Photometer (MIPS) instrument for the
      Spitzer Space Telescope.
    sentences:
      - Mollusks can be divided into seven classes.
      - A telescope is used to make objects in space appear closer.
      - The energy content of foods is often expressed in calories.
  - source_sentence: >-
      Glucocorticoids and mineralocorticoids are the two main types of
      corticosteroids in humans.
    sentences:
      - >-
        Glucocorticoids and mineralocorticoids are the two main types of what in
        humans?
      - >-
        The thick skin, found only on the palms of the hands and the soles of
        the feet, has an extra what?
      - >-
        Human evolution shows that evolutionary changes typically occur at what
        pace?
  - source_sentence: do yellow finches change color in the winter
    sentences:
      - >-
        Hello, Dolly! (musical) The role of Dolly Levi in the musical was
        originally written for Ethel Merman, but Merman turned it down, as did
        Mary Martin (although each eventually played it).[3] Merrick then
        auditioned Nancy Walker. Eventually, he hired Carol Channing, who then
        created in Dolly her signature role.[5] Director Gower Champion was not
        the producer's first choice, as Hal Prince and others (among them Jerome
        Robbins and Joe Layton) all turned down the job of directing the
        musical.[6]
      - >-
        American goldfinch Once the spring molt is complete, the body of the
        male is a brilliant lemon yellow, a color produced by carotenoid
        pigments from plant materials in its diet,[18] with a striking jet black
        cap and white rump that is visible during flight.[19] The female is
        mostly brown, lighter on the underside with a yellow bib.[17] After the
        autumn molt, the bright summer feathers are replaced by duller plumage,
        becoming buff below and olive-brown above, with a pale yellow face and
        bib. The autumn plumage is almost identical in both sexes, but the male
        has yellow shoulder patches.[20] In some winter ranges, the goldfinches
        lose all traces of yellow, becoming a predominantly medium tan-gray
        color with an olive tinge evident only on close viewing.
      - >-
        Wide boy An early use of the term was in the 1933 film "Friday the
        Thirteenth", where the character played by Max Miller, a loud,
        quick-witted, Cockney market trader, is heard to say "I'm the widest boy
        ever put on a pair of shoes!"
pipeline_tag: sentence-similarity
model-index:
  - name: SentenceTransformer based on microsoft/deberta-v3-small
    results:
      - task:
          type: semantic-similarity
          name: Semantic Similarity
        dataset:
          name: sts test
          type: sts-test
        metrics:
          - type: pearson_cosine
            value: 0.7653416933913197
            name: Pearson Cosine
          - type: spearman_cosine
            value: 0.7593865234485873
            name: Spearman Cosine
          - type: pearson_manhattan
            value: 0.7605105025260754
            name: Pearson Manhattan
          - type: spearman_manhattan
            value: 0.7515978828464567
            name: Spearman Manhattan
          - type: pearson_euclidean
            value: 0.7529907774019836
            name: Pearson Euclidean
          - type: spearman_euclidean
            value: 0.7436431053840061
            name: Spearman Euclidean
          - type: pearson_dot
            value: 0.5401711611334493
            name: Pearson Dot
          - type: spearman_dot
            value: 0.5559615063301898
            name: Spearman Dot
          - type: pearson_max
            value: 0.7653416933913197
            name: Pearson Max
          - type: spearman_max
            value: 0.7593865234485873
            name: Spearman Max

SentenceTransformer based on microsoft/deberta-v3-small

This is a sentence-transformers model finetuned from microsoft/deberta-v3-small on the nli-pairs, sts-label, vitaminc-pairs, qnli-contrastive, scitail-pairs-qa, scitail-pairs-pos, xsum-pairs, compression-pairs, sciq_pairs, qasc_pairs, openbookqa_pairs, msmarco_pairs, nq_pairs, trivia_pairs, quora_pairs and gooaq_pairs datasets. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DebertaV2Model 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
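
The Pooling module above enables only `pooling_mode_mean_tokens`, so a sentence vector is the average of its token vectors. The following is a minimal pure-Python sketch of that averaging, not the library's implementation: the name `mean_pool`, the 2-dimensional toy vectors, and the way padding is skipped through an attention mask are assumptions made for illustration (the real model works with 768-dimensional vectors).

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padded positions are skipped."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    count = max(count, 1)  # avoid dividing by zero for an all-padding input
    return [value / count for value in total]


# Toy input: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padding vector is ignored, so only the first two tokens contribute to the result.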

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bobox/DeBERTa-ST-AllLayers-testing-v3-checkpoints-tmp")
# Run inference
sentences = [
    'do yellow finches change color in the winter',
    'American goldfinch Once the spring molt is complete, the body of the male is a brilliant lemon yellow, a color produced by carotenoid pigments from plant materials in its diet,[18] with a striking jet black cap and white rump that is visible during flight.[19] The female is mostly brown, lighter on the underside with a yellow bib.[17] After the autumn molt, the bright summer feathers are replaced by duller plumage, becoming buff below and olive-brown above, with a pale yellow face and bib. The autumn plumage is almost identical in both sexes, but the male has yellow shoulder patches.[20] In some winter ranges, the goldfinches lose all traces of yellow, becoming a predominantly medium tan-gray color with an olive tinge evident only on close viewing.',
    'Wide boy An early use of the term was in the 1933 film "Friday the Thirteenth", where the character played by Max Miller, a loud, quick-witted, Cockney market trader, is heard to say "I\'m the widest boy ever put on a pair of shoes!"',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
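
To make the similarity step concrete, here is a hedged pure-Python sketch of a pairwise cosine similarity matrix and of ranking candidates against the first sentence. It assumes the model scores with cosine similarity (the library default when no other similarity function is configured); the helper names and the 2-dimensional toy vectors are illustrative and stand in for the 768-dimensional output of `model.encode`.

```python
import math
from typing import List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_matrix(embeddings: List[List[float]]) -> List[List[float]]:
    """Square matrix whose entry [i][j] is the cosine similarity of rows i and j."""
    return [[cosine(u, v) for v in embeddings] for u in embeddings]


# Toy stand-ins for three sentence embeddings (query first, then two candidates).
toy_embeddings = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
matrix = similarity_matrix(toy_embeddings)

# Rank the candidates (indices 1 and 2) by similarity to the query (index 0).
query_row = matrix[0]
ranked = sorted(range(1, len(toy_embeddings)), key=lambda j: query_row[j], reverse=True)
print(ranked)
```

With real embeddings, the same ranking step would be expected to place the goldfinch passage above the unrelated one for the query used in the example, although that depends on the checkpoint.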

Evaluation

Metrics

Semantic Similarity

| Metric | Value |
|---|---|
| pearson_cosine | 0.7653 |
| spearman_cosine | 0.7594 |
| pearson_manhattan | 0.7605 |
| spearman_manhattan | 0.7516 |
| pearson_euclidean | 0.753 |
| spearman_euclidean | 0.7436 |
| pearson_dot | 0.5402 |
| spearman_dot | 0.556 |
| pearson_max | 0.7653 |
| spearman_max | 0.7594 |
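
Each metric above is a correlation between the model's similarity scores and the gold STS scores. The sketch below shows how Pearson and Spearman correlations can be computed in plain Python. The toy `gold` and `predicted` lists are invented for illustration, and the reading of the `*_max` rows as the maximum over the four similarity functions is an assumption that happens to agree with the reported values.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values: List[float]) -> List[float]:
    """1-based ranks in ascending order; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented toy data: gold similarity labels and model cosine similarities.
gold = [0.0, 0.2, 0.5, 0.9, 1.0]
predicted = [0.10, 0.05, 0.40, 0.95, 0.80]
print(pearson(gold, predicted), spearman(gold, predicted))

# Reported Pearson values from the table; the max matches the pearson_max row.
reported_pearson = {"cosine": 0.7653, "manhattan": 0.7605, "euclidean": 0.753, "dot": 0.5402}
pearson_max = max(reported_pearson.values())
```

Spearman only looks at the ordering of the scores, which is why it is often preferred for judging embedding models whose raw similarity scale is arbitrary.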

Training Details

Training Datasets

nli-pairs

  • Dataset: nli-pairs at d482672
  • Size: 15,000 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 |
    |---|---|---|
    | type | string | string |
    | details | min: 5 tokens, mean: 16.62 tokens, max: 62 tokens | min: 4 tokens, mean: 9.46 tokens, max: 29 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |---|---|
    | A person on a horse jumps over a broken down airplane. | A person is outdoors, on a horse. |
    | Children smiling and waving at camera | There are children present |
    | A boy is jumping on skateboard in the middle of a red bridge. | The boy does a skateboarding trick. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

sts-label

  • Dataset: sts-label at ab7a5ac
  • Size: 5,749 training samples
  • Columns: sentence1, sentence2, and score
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 | score |
    |---|---|---|---|
    | type | string | string | float |
    | details | min: 6 tokens, mean: 9.81 tokens, max: 27 tokens | min: 5 tokens, mean: 9.74 tokens, max: 25 tokens | min: 0.0, mean: 0.54, max: 1.0 |
  • Samples:
    | sentence1 | sentence2 | score |
    |---|---|---|
    | A plane is taking off. | An air plane is taking off. | 1.0 |
    | A man is playing a large flute. | A man is playing a flute. | 0.76 |
    | A man is spreading shreded cheese on a pizza. | A man is spreading shredded cheese on an uncooked pizza. | 0.76 |
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
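
As a rough illustration of what CoSENTLoss optimizes with the `scale` shown above: for every two pairs where the first has a lower gold score than the second, the loss grows if the first pair nevertheless has the higher cosine similarity. The pure-Python sketch below follows that idea. The function name `cosent_loss` and the cosine values are invented for illustration, and this is not the library's batched implementation.

```python
import math
from typing import List


def cosent_loss(cos_sims: List[float], gold_scores: List[float], scale: float = 20.0) -> float:
    """log(1 + sum of exp(scale * (cos_i - cos_j))) over all i, j with gold_i < gold_j."""
    terms = [0.0]  # exp(0) supplies the "1 +" inside the logarithm
    for i in range(len(cos_sims)):
        for j in range(len(cos_sims)):
            if gold_scores[i] < gold_scores[j]:
                terms.append(scale * (cos_sims[i] - cos_sims[j]))
    largest = max(terms)  # log-sum-exp trick for numerical stability
    return largest + math.log(sum(math.exp(t - largest) for t in terms))


# Gold scores taken from the three sts-label samples; cosine values are made up.
gold = [1.0, 0.76, 0.76]
well_ordered = cosent_loss([0.95, 0.70, 0.72], gold)
badly_ordered = cosent_loss([0.40, 0.90, 0.85], gold)
print(well_ordered, badly_ordered)
```

Only the relative order of the similarities matters here, so the loss never pushes a pair toward an absolute target value.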
    

vitaminc-pairs

  • Dataset: vitaminc-pairs at be6febb
  • Size: 15,000 training samples
  • Columns: label, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | label | sentence1 | sentence2 |
    |---|---|---|---|
    | type | int | string | string |
    | details | 1: 100.00% | min: 7 tokens, mean: 16.94 tokens, max: 88 tokens | min: 10 tokens, mean: 39.24 tokens, max: 502 tokens |
  • Samples:
    | label | sentence1 | sentence2 |
    |---|---|---|
    | 1 | Fantastic Four has a rating below 5 % . | On Rotten Tomatoes , the film holds an approval rating of 4 % based on , with a weighted average rating of 3.5/10 . |
    | 1 | The Proclaimers were guests on the Jeremy Kyle show . | They are best known for their appearance on Jeremy Kyle where it was proved the duo were not actually brothers , they are also known for songs I 'm Gon na Be ( 500 Miles ) '' , Sunshine on Leith '' , I 'm On My Way '' and Letter from America '' , and their singing style with a Scottish accent . |
    | 1 | Maurice Harkless was interested in working with Puerto Rico former coach Paco Olmos . | On January 29 , 2014 , Harkless declared his interest in playing for Puerto Rico at the 2014 FIBA Basketball World Cup following a series of reunions with Puerto Rico former coach Paco Olmos . |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

qnli-contrastive

  • Dataset: qnli-contrastive at bcdcba7
  • Size: 15,000 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 | label |
    |---|---|---|---|
    | type | string | string | int |
    | details | min: 6 tokens, mean: 13.75 tokens, max: 41 tokens | min: 6 tokens, mean: 35.51 tokens, max: 180 tokens | 0: 100.00% |
  • Samples:
    | sentence1 | sentence2 | label |
    |---|---|---|
    | Can real stacking be accomplished? | A mark handled this way will appear over whatever character precedes it, but will not adjust its position relative to the width or height of the base glyph; it may be visually awkward and it may overlap some glyphs. | 0 |
    | Drug-resistant TB is one of the barriers to success of the Stop TB Partnership's initiative; what's the other other? | The World Health Organization declared TB a "global health emergency" in 1993, and in 2006, the Stop TB Partnership developed a Global Plan to Stop Tuberculosis that aims to save 14 million lives between its launch and 2015. | 0 |
    | In what year did the king demand ale-sellers post signage on pain of forfeiture? | In 1393 King Richard II compelled landlords to erect signs outside their premises. | 0 |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "OnlineContrastiveLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2.5,
        "kl_div_weight": 5,
        "kl_temperature": 0.25
    }
    

scitail-pairs-qa

  • Dataset: scitail-pairs-qa at 0cc4353
  • Size: 14,537 training samples
  • Columns: sentence2 and sentence1
  • Approximate statistics based on the first 1000 samples:
    |  | sentence2 | sentence1 |
    |---|---|---|
    | type | string | string |
    | details | min: 7 tokens, mean: 15.76 tokens, max: 41 tokens | min: 7 tokens, mean: 15.02 tokens, max: 34 tokens |
  • Samples:
    | sentence2 | sentence1 |
    |---|---|
    | We call the solid form of hydrocarbons coal. | What do we call the solid form of hydrocarbons? |
    | Blood flow decrease when blood vessels constrict. | Does blood flow increase or decrease when blood vessels constrict? |
    | Exact wavelength determines the color of visible light. | What determines the color of visible light? |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

scitail-pairs-pos

  • Dataset: scitail-pairs-pos at 0cc4353
  • Size: 8,600 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 |
    |---|---|---|
    | type | string | string |
    | details | min: 8 tokens, mean: 23.09 tokens, max: 64 tokens | min: 7 tokens, mean: 15.55 tokens, max: 35 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |---|---|
    | Light waves bend when passing diagonally from one material to another because the speed of light changes slightly according to the density of the material it is traversing. | When light passes from one medium to another, it changes speed. |
    | Weight is the force exerted on the object by gravity. | Weight is the term for the measure of the force of gravity pulling down on an object. |
    | Then, if carbon dioxide is formed of one atom of carbon and two atoms of oxygen, the proportion must naturally consist of 3 parts of carbon to 8 of oxygen. | Carbon dioxide molecules consist of a central carbon atom bonded to two oxygen atoms. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

xsum-pairs

  • Dataset: xsum-pairs at 788ddaf
  • Size: 4,750 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 |
    |---|---|---|
    | type | string | string |
    | details | min: 14 tokens, mean: 349.39 tokens, max: 512 tokens | min: 8 tokens, mean: 27.01 tokens, max: 67 tokens |
  • Samples:
    sentence1 sentence2
    Kerry Smith was selected for the South Basildon and Thurrock seat this week, after former Tory MP Neil Hamilton pulled out of the contest.
    In a recording obtained by the Mail On Sunday, he is heard making offensive remarks about gay people, other UKIP members and Chigwell in Essex.
    He later issued a "wholehearted and unreserved apology".
    UKIP said the phone call was made some time ago, when Mr Smith had been prescribed sedatives after an injury.
    During the recorded conversation, Mr Smith talks about UKIP's lesbian, gay, bisexual and transgender (LGBT) group and he can be heard jokingly referring to it as BLT UKIP, and adds "what the old poofter groups call themselves".
    He jokes about "shooting peasants" from the Essex town of Chigwell and supporting "a peasant's hunt through Chigwell village".
    Last week Mr Smith was chosen as UKIP's candidate for South Basildon and East Thurrock, a seat in which the party hopes to make a serious challenge.
    A UKIP spokesman confirmed to the BBC that Mr Smith had apologised to the party's leader Nigel Farage for allegations made against him during the phone call - which Mr Smith has since retracted.
    Patrick O'Flynn, UKIP MEP for the East of England, told BBC One's Sunday Politics that the phone call had been made "some time ago while he was on sedatives" and he had not been "speaking and thinking rationally".
    He said the party's candidates had to "watch how to express themselves" adding: "What many people call political correctness is often just politeness and using derogatory terminology, pejorative slang is not right at this level of politics and you shouldn't do it."
    He said Mr Smith was not homophobic but needed to "learn to express himself more respectfully about minorities of all kinds".
    He pointed out that this week other parties had suffered gaffes by members - with Labour MP Frank Doran apologising for suggesting the post of fisheries minister was not a "job for a woman", while Conservative peer Baroness Jenkin of Kennington apologised for saying "poor people don't know how to cook".
    "The hand grenades are rolling down the corridor. We're still way up in the polls, we've had a fantastic year, we've won two by-elections."
    "He's a young man he's learning politics - we also have to have a balance, we don't want to become so anodyne speaking in such non-colloquial language that we lose touch, and I think some other parties risk doing that.
    "But clearly what he has said there is unacceptable - he's apologised unreservedly there. There are big mitigating circumstances here, it was from some time ago, and so we are willing to judge him on his performance from now on."
    In a statement made by a UKIP spokesman on his behalf, Mr Smith said: "I wish to issue a wholehearted and unreserved apology to those who I have offended within the party and anyone else.
    "With regards to the leadership and management of the party I was completely wrong and my comments were fuelled by frustrations."
    Former Conservative MP Neil Hamilton pulled out of the selection contest for South Basildon and East Thurrock after a letter from UKIP's finance committee was leaked to Channel 4 News querying Mr Hamilton's expenses claims for the party.
    The former MP has suggested there is a "dirty tricks" campaign against him.
    A report in the Financial Times said party officials were accusing one of its biggest funders of trying to pressure them into accepting Mr Hamilton's candidacy. He previously pulled out of another selection process, in Boston and Skegness.
    A UKIP candidate for one of the party's target seats has apologised for offensive remarks made in a phone call.
    Thousands of gaming fans voted for their favourite games, and the top 15 finalists have been revealed.
    The award was created to celebrate games from all platforms including arcade, console, computer, handheld, and mobile.
    Games had to pass four categories:
    Icon-status, longevity, geographical reach and influence.
    This year's winners will be announced at a ceremony on the 4th June.
    To celebrate the World Video Game Hall of Fame we've picked out some of the finalists for a closer look.
    So put down your controllers and enjoy our guide to some of the best games of all time!
    If you've never heard of Minecraft you must have been living in outer space! The games popularity has grown and grown since it was first launched in 2009.
    Players use pixelated blocks to create detailed buildings and worlds. Minecraft became so popular techno giant Microsoft bought it from its original creators last year for £1.5 billion.
    As of 2014, more than 54 million copies of the game have been sold.
    And you've even told us why you think it is so popular.
    Sonic and Mario are two of the heavyweights in the gaming world, and have a long history of rivalry.
    Super Mario Bros was created by Nintendo and came out in 1985.
    The game featured an Italian plumber with a big moustache called Mario and became so popular it launched a number of spin off games that lasted generations.
    More than 509 million copies of the various Mario games have been sold worldwide.
    Sonic the Hedgehog was launched in 1991 by Sega, and featured a speedy blue hedgehog, who liked to collect gold rings.
    At one point Sonic was so popular, children in America were able to identify him over other characters like Mickey Mouse or the President.
    The rivalry between Mario and Sonic came to an end in 2007, when they both appeared in Mario and Sonic at the Olympic Games.
    Find out more about the history of Nintendo
    Another huge hit from the Nintendo universe is Pokémon.
    The game first came out on the Game Boy in 1996, called Pocket Monsters, and was created by Japanese developers Game Freak.
    Gamers play as a Pokémon trainer who can catch and battle a large number of different Pokémon.
    Since Pokémon was released it has become the third most-popular franchise worldwide after Mario and Super Mario.
    As of 2014, the Pokémon series had sold more than 260 million copies of its games, around 21.5 billion trading cards, and has created more than 800 television episodes and 17 movies.
    Find out how Pokemon became a global hit
    The FIFA series has become one of the most popular sports game franchises in the world since it was first released by Electronic Arts in 1993.
    It allows gamers to play as their favourite football teams from different leagues all over the world.
    FIFA 12 holds the record for the "fastest-selling sports game ever" selling over 3.2 million copies in the first week of its release.
    First released in 2009 by Finnish developers Rovio Entertainment, Angry Birds became the first ever mobile game to achieve worldwide fame and popularity.
    Players use a giant slingshot to catapult various bird characters at a number of different structures, in an attempt to knock everything over.
    Angry Birds has expanded into a number of different consoles and has even joined up with Sony Pictures to create a film.
    Games in the Angry Birds series have been downloaded more than two billion times.
    The granddaddy of video games, Pong was created in 1972 and is widely viewed as being the first ever "video game".
    Pong is essentially a simple tennis style game, where players have to keep a rally going, or score as many points against their opponent as possible.
    The game was created using simple 2D, black and white graphics, and was made by the company Atari.
    As Pong became popular it encouraged Atari to design more games, and encouraged many other developers to create new games too.
    The famous pixel block game was originally designed by a Russian programmer called Alexey Pajitnov in 1984.
    Tetris was one of the original games for the Game Boy.
    Players have to fit different coloured and shaped blocks together.
    Tetris is available on almost every gaming platform, and as of 2010 has sold more than 170 million copies worldwide.
    Top gaming experts are voting on the best video games to go into the World Video Game Hall of Fame, in America.
    India's western state of Gujarat certainly believes so. Earlier this week, the state's legislators passed a bill which makes it mandatory for candidates to have toilets in their homes to qualify for contesting elections to local municipalities and village councils. Existing elected members will also have to declare within six months that they have toilets at home, failing which they will face disqualification.
    Prime Minister Narendra Modi, who ruled Gujarat for over a decade before he swept to power in Delhi in May, has made abolishing open defecation a top priority of his government. It is a laudable aim, though critics believe it does not appear to link what is largely an individual-driven campaign to the appalling practice of manual scavenging. Clearly legislators belonging to Mr Modi's ruling BJP in Gujarat have enthusiastically backed their leader's call.
    Surely, there is nothing wrong in that. Open defecation blights the lives of millions of Indians and is an enduring health hazard. Nearly half of Indians continue to defecate in the open. Gujarat, one of India's most prosperous states, is in a hurry to build more toilets; the state has a spotty record here. Its new Chief Minister Anandi Patel says she wants the state to be "open defecation free" in two years. A recent report said more than 70,000 people defecate in the open in the main city of Ahmedabad alone. Good economics does not always lead to good sanitation.
    But is the latest move linking a democratic right to building a private utility such a good idea?
    Some 40% of people in Gujarat live in its 159 municipalities and eight municipal corporation areas in what is one of India's most urbanised states. There are some 13,500 village councils in its more than 18,500 villages. Elections to these bodies are critical to the health of Gujarat's democracy and development. The freedom to contest the polls is also an inalienable right of every citizen living in their cities and villages.
    That is why critics like economist Hemant Shah feel that the bill is essentially "undemocratic and discriminatory", and should be challenged in the courts.
    Tens of thousands of people in Gujarat's teeming cities live in sprawling chawls - densely packed buildings with more than a dozen tenements - where many families share a single toilet. Will a chawl resident be barred from contesting because he does not have his private toilet? What happens to the political aspirations of a resident of a grubby shantytown home so small that his living space is sometimes equal to the non-existent toilet?
    "The government should first provide space and money to build toilets for the poor. The poor are most affected by urban planning because it has always excluded them. Now they can't dream from standing for public office just because they don't have the space or money to build their own toilets?" asks Professor Shah. It's a valid question.
    Is banning a person from contesting for public office if he or she does not have a toilet at home a good idea?
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 2,
        "kl_div_weight": 1,
        "kl_temperature": 0.5
    }
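
For intuition about the inner loss named above, MultipleNegativesSymmetricRankingLoss treats the other positives in a batch as negatives and applies a cross-entropy ranking objective in both directions (anchor to positive and positive to anchor). The pure-Python sketch below works on a precomputed cosine similarity matrix. The function names and the `scale` of 20.0 are assumptions for illustration (the card does not list the inner loss's scale), and the layer-wise AdaptiveLayerLoss wrapper is deliberately left out.

```python
import math
from typing import List


def cross_entropy_diagonal(scores: List[List[float]]) -> float:
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(scores):
        largest = max(row)  # log-sum-exp trick for numerical stability
        log_z = largest + math.log(sum(math.exp(s - largest) for s in row))
        total += log_z - row[i]
    return total / len(scores)


def symmetric_ranking_loss(similarities: List[List[float]], scale: float = 20.0) -> float:
    """Average of the anchor-to-positive and positive-to-anchor ranking losses."""
    scaled = [[scale * s for s in row] for row in similarities]
    transposed = [list(column) for column in zip(*scaled)]
    return (cross_entropy_diagonal(scaled) + cross_entropy_diagonal(transposed)) / 2


# similarities[i][j] = cosine similarity of anchor i (e.g. an article) and positive j (a summary).
aligned = [[0.9, 0.1], [0.2, 0.8]]    # each anchor is closest to its own positive
shuffled = [[0.1, 0.9], [0.8, 0.2]]   # each anchor is closest to the wrong positive
aligned_loss = symmetric_ranking_loss(aligned)
shuffled_loss = symmetric_ranking_loss(shuffled)
print(aligned_loss, shuffled_loss)
```

Because every other item in the batch acts as a negative, larger batches generally give this kind of loss a harder and more informative ranking task.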
    

compression-pairs

  • Dataset: compression-pairs at 605bc91
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 |
    |---|---|---|
    | type | string | string |
    | details | min: 10 tokens, mean: 31.04 tokens, max: 372 tokens | min: 5 tokens, mean: 10.01 tokens, max: 25 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |---|---|
    | The Canadian dollar continued its meteoric rise Wednesday to as high as US98.91¢ before the market open, edging ever closer to parity. | Canadian dollar edges closer to parity |
    | NFL Network insider Jason La Canfora reports Anderson has agreed to terms with the Cardinals, according to a league source. | Anderson agrees to terms with Cardinals |
    | Churchill Downs said Monday it will increase its annual dividend 20 percent to 60 cents per share.The Louisville, Ky., racetrack operator and gambling company said it raised the dividend to reflect its strong financial results so far this year. | Churchill Downs increases annual dividend 20 percent |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 2,
        "kl_div_weight": 1,
        "kl_temperature": 0.5
    }
    

sciq_pairs

  • Dataset: sciq_pairs at 2c94ad3
  • Size: 11,328 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | sentence1 | sentence2 |
    |---|---|---|
    | type | string | string |
    | details | min: 5 tokens, mean: 16.97 tokens, max: 54 tokens | min: 2 tokens, mean: 85.2 tokens, max: 512 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |---|---|
    | How do the vast majority of fish reproduce? | Nearly all fish reproduce sexually, and most species have separate sexes. Those without separate sexes avoid self-fertilization by producing sperm and eggs at different times. Each fish typically produces a large number of gametes. In most fish species, fertilization takes place externally. These fish are oviparous . Eggs are laid and embryos develop outside the mother’s body. In a minority of fish, including sharks, eggs develop inside the mother’s body but without nourishment from the mother. These fish are ovoviviparous . |
    | What is the loss of energy available to do work called? | 15.6 Entropy and the Second Law of Thermodynamics: Disorder and the Unavailability of Energy • Entropy is the loss of energy available to do work. • Another form of the second law of thermodynamics states that the total entropy of a system either increases or remains constant; it never decreases. • Entropy is zero in a reversible process; it increases in an irreversible process. • The ultimate fate of the universe is likely to be thermodynamic equilibrium, where the universal temperature is constant and no energy is available to do work. • Entropy is also associated with the tendency toward disorder in a closed system. |
    | How many vertebrae make up the human vertebral column? | Human Vertebral Column and Vertebrae. The human vertebral column consists of 33 vertebrae. Two vertebrae are shown here enlarged. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

qasc_pairs

  • Dataset: qasc_pairs at a34ba20
  • Size: 7,889 training samples
  • Columns: id, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:
    |  | id | sentence1 | sentence2 |
    |---|---|---|---|
    | type | string | string | string |
    | details | min: 17 tokens, mean: 21.27 tokens, max: 27 tokens | min: 4 tokens, mean: 11.28 tokens, max: 23 tokens | min: 16 tokens, mean: 34.86 tokens, max: 68 tokens |
  • Samples:
    | id | sentence1 | sentence2 |
    |---|---|---|
    | 3KMS4QQVK2P724SORHWYGW4AU78KFR | What is in fruit? | sugar causes food to taste sweet. Fruit is delicious and very sweet.. sugar is in fruit |
    | 3ATTHHXXWANXWVTLR8H89NP4TSOXIK | What can lead to cancer in genes that control the cell cycle? | Mutations that lead to cancer usually occur in genes that control the cell cycle.. Many carcinogens are capable of causing gene mutations.. Carcinogens can lead to cancer in genes that control the cell cycle |
    | 3EQHHY4HQSRAYL3GVEYAWSL4O5BG57 | what do plants lose to the atmosphere? | transpiration is when water vapor moves from plants into the atmosphere. Plants lose water continually by transpiration.. plants lose water to the atmosphere |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
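
Most datasets in this card wrap GISTEmbedLoss, which uses a guide model to discard in-batch negatives that are probably false negatives. The following is a rough, dependency-free sketch of that idea only. The function name `guided_in_batch_loss`, the temperature value, and the toy similarity matrices are all hypothetical, and the sketch covers anchor-to-positive similarities only, whereas the library's loss may also compare anchors with anchors and positives with positives.

```python
import math
from typing import List


def guided_in_batch_loss(
    sim: List[List[float]],
    guide_sim: List[List[float]],
    temperature: float = 0.05,
) -> float:
    """Cross-entropy over in-batch candidates, skipping guide-flagged negatives.

    sim[i][j]       similarity of anchor i and positive j from the model being trained
    guide_sim[i][j] the same similarity as scored by the guide model
    """
    n = len(sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i]  # guide score of the true pair
        logits = []
        for j in range(n):
            # A "negative" the guide rates above the true pair is treated as a
            # likely false negative and is left out of the softmax.
            if j != i and guide_sim[i][j] > threshold:
                continue
            logits.append(sim[i][j] / temperature)
        target = sim[i][i] / temperature
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum_exp - target
    return total / n


# Toy batch of two pairs: the guide thinks positive 1 also matches anchor 0.
model_sim = [[0.9, 0.8], [0.1, 0.9]]
guide_with_flag = [[0.7, 0.95], [0.1, 0.8]]
guide_no_flag = [[1.0, 0.0], [0.0, 1.0]]

masked = guided_in_batch_loss(model_sim, guide_with_flag)
unmasked = guided_in_batch_loss(model_sim, guide_no_flag)
print(masked < unmasked)
```

Masking the flagged candidate removes its penalty, so the model is not pushed away from a passage that the guide considers relevant.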
    

openbookqa_pairs

  • Dataset: openbookqa_pairs at 388097e
  • Size: 2,637 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 3 tokens, mean: 13.8 tokens, max: 65 tokens | min: 4 tokens, mean: 11.24 tokens, max: 30 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | a toaster converts electricity into radiant waves for | a toaster converts electrical energy into heat energy for toasting |
    | A person loves spring, and it has just passed by. They will enjoy it again the next time | each season occurs once per year |
    | a student leaves a nail line on a mineral sample, so that mineral can be described as what? | if a mineral can be scratched by a fingernail then that mineral is soft |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

msmarco_pairs

  • Dataset: msmarco_pairs at 28ff31e
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 4 tokens, mean: 8.58 tokens, max: 32 tokens | min: 18 tokens, mean: 76.09 tokens, max: 205 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | when was remy ma born | Overview (4). Remy Ma was born on May 30, 1980 in South Bronx, The Bronx, New York City, New York, USA as Remy K. Smith. She has been married to Papoose since May 12, 2009. |
    | what causes bladder pain | Bladder pain may be caused by a number of conditions, including a urinary tract infection. A urine sample can be used to detect a bladder infection. Several forms of cancer may result in pain in the bladder. Pain is a common symptom of bladder tumor growths. The human urinary tract, including the bladder in pink at the bottom. |
    | what is .shp | SHP for Agencies. SHP for Home Health Agencies (or simply SHP for Agencies) is a web-based analytics and benchmarking solution that gives home health organizations the power to effectively manage performance, stay compliant, and follow best practices. The SHP for Agencies solution helps your organization: |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

nq_pairs

  • Dataset: nq_pairs at f9e894e
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 9 tokens, mean: 11.83 tokens, max: 21 tokens | min: 16 tokens, mean: 134.32 tokens, max: 512 tokens |
  • Samples:
    sentence1 sentence2
    what is the panthers name in disney's the jungle book Bagheera In Disney's 1967 animated adaptation, Bagheera the panther is, as in the book, male, and voiced by Sebastian Cabot. The panther is portrayed as an intelligent, mature, and logical character, quite similar to the Bagheera in the books. In the film, it is Bagheera and not the wolves who first finds Mowgli, a young village child. It is Bagheera who brings Mowgli to the care of the wolves and ensures that the baby survives. He is also the one who takes him back to the village, for his own safety, as he knew for years that Mowgli would eventually need to leave his adoptive animal family to return to his place in the human world. During the film, Bagheera often lectures Baloo, for he knows that as long as Shere Khan is in the jungle, the jungle is not safe for Mowgli despite all of Baloo's attempts to protect him. Bagheera is also the narrator of the film's story.
    one unit is equal to how many ml Unit of alcohol One unit of alcohol (UK) is defined as 10 millilitres (8 grams) of pure alcohol.[2][3] Typical drinks (i.e., typical quantities or servings of common alcoholic beverages) may contain 1–3 units of alcohol.[3]
    who sings can't get you outta my head Can't Get You Out of My Head "Can't Get You Out of My Head" is a song recorded by Australian singer Kylie Minogue for her eighth studio album, titled Fever, which she released in 2001. The song was released in Australia by Parlophone as the lead single from the album on 8 September 2001. It was released on 17 September 2001 in the United Kingdom. In the United States, the single was released on 18 February 2002. Jointly written, composed, and produced by Cathy Dennis and Rob Davis, "Can't Get You Out of My Head" is a midtempo dance-pop song which lyrically details its narrator's obsession towards her lover. The song is famous for its "la la la" hook.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

trivia_pairs

  • Dataset: trivia_pairs at a7c36e3
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 8 tokens, mean: 17.05 tokens, max: 43 tokens | min: 21 tokens, mean: 446.31 tokens, max: 512 tokens |
  • Samples:
    sentence1 sentence2
    Main is French for which part of the body? Les parties du corps - Des os, il en faut - alain le lait (French body parts) - YouTube Les parties du corps - Des os, il en faut - alain le lait (French body parts) Want to watch this again later? Sign in to add this video to a playlist. Need to report the video? Sign in to report inappropriate content. Rating is available when the video has been rented. This feature is not available right now. Please try again later. Uploaded on Oct 30, 2011 Des os, il en faut - alain le lait du CD 'Parapluie' ©2006 Une chanson sur les parties du corps Words and english translation Tu as deux mains et deux pieds Tu as deux jambes et un nez Tu as un ventre et un dos Et des muscles sous la peau Tu as une tête et un cou Deux oreilles et deux genoux Tu as deux yeux et deux joues Et une bouche qui mange tout, et Sous ta peau il y a des os Des petits et des gros Des os, des os, il en faut C'est parce que tu as des os que ... Bones, you must have them You have two hands and two feet You have two legs and a nose You have a belly (stomach) and a back And muscles underneath your skin You have a head and a neck Two ears and two knees You have two eyes and two cheeks And a mouth that eats everything and Under your skin you have bones Small bones and big ones Bones, bones, you must have them It's because you have bones that ... (repeat from top of the song) Category
    Ainsley Harriot was once head chef at which UK cricket ground? Ainsley Harriott - Awards Hosts & Presenter - Speakers Corner Ainsley Harriott Biography Celebrity chef and Ready Steady Cook presenter Ainsley Harriott is quick-witted, charismatic and a consummate professional. Although singing and performing was his first love (he co-founded the Calypso Twins in the early 90’s) Ainsley’s cooking career began when he was offered an apprenticeship at an East End restaurant at the age of 16. After years of hard work in the kitchen, Ainsley rose to Head Chef position at Lord's Cricket Ground's Long Room. He worked as a chef in many restaurants in London including the Dorchester, Brown's, The Hilton, The Westbury, Café Pelican and Quaglino's. Simultaneously his foray into TV and radio began. While at Lords he was asked to present More Nosh, Less Dosh on BBC Radio 5, and he then secured a small role in sci-fi comedy Red Dwarf in 1993, and eventually became resident chef on Good Morning with Anne and Nick. Once Ainsley... Celebrity chef and Ready Steady Cook presenter Ainsley Harriott is quick-witted, charismatic and a consummate professional. Although singing and performing was his first love (he co-founded the Calypso Twins in the early 90’s) Ainsley’s cooking career began when he was offered an apprenticeship at an East End restaurant at the age of 16. After years of hard work in the kitchen, Ainsley rose to Head Chef position at Lord's Cricket Ground's Long Room. He worked as a chef in many restaurants in London including the Dorchester, Brown's, The Hilton, The Westbury, Café Pelican and Quaglino's. Simultaneously his foray into TV and radio began. While at Lords he was asked to present More Nosh, Less Dosh on BBC Radio 5, and he then secured a small role in sci-fi comedy Red Dwarf in 1993, and eventually became resident chef on Good Morning with Anne and Nick. 
Once Ainsley became the main presenter of Can't Cook, Won't Cook and later Ready, Steady, Cook, he was a household name. Ainsley's Barbecue Bible; Ainsley's Meals in Minutes; Ainsley's Big Cook Out and Ainsley's Gourmet Express all followed, and he became one of the most famous TV cooking faces in the country. In 2000, Ainsley made his US TV debut with The Ainsley Harriott Show and then Ready.. Set... Cook!, the US version of Ready Steady Cook. For further information or to book Ainsley Harriott, call us at Speakers Corner on +44 (0)20 7607 7070 or email info@speakerscorner.co.uk
    Who wrote the 1973 novel ‘The Dressmaker’? The Dressmaker: Beryl Bainbridge: 9780786703227: Amazon.com: Books Beryl Bainbridge Next Special Offers and Product Promotions From AudioFile English actor and audiobook reader Jacqueline King performs this thickly British story with the skill necessary to enliven five distinct characters and stitch them all together through the lucid prose of the novel's guiding narrator. In that the story is beautifully constructed to begin with, the listener is in for a fine artistic experience. The setting is Liverpool, 1944. The war pressures naïve teenaged Rita to dream beyond the fortified shores of her own country. The town is full of Yanks who come from the land of Hollywood. Rita claims one for herself, but her two aunts, who have raised her, see more and less in him than Rita suspects. The ending is inspired and in itself gives reason why this book was runner-up for the Booker Prize. The recording quality is hissy (muffled with Dolby), but it strangely adds to the atmosphere if one knows how radios used to sound during those dark, uncertain days. P.W. Winner of AudioFile Earphones Award © AudioFile 2002, Portland, Maine-- Copyright © AudioFile, Portland, Maine --This text refers to the Hardcover edition. Don't have a Kindle? Get your Kindle here , or download a FREE Kindle Reading App . New York Times best sellers Browse the New York Times best sellers in popular categories like Fiction, Nonfiction, Picture Books and more. See more Product Details Publisher: Carroll & Graf Publishers (July 1996) Language: English Product Dimensions: 8.2 x 5.4 x 0.4 inches Shipping Weight: 6.4 ounces By Cariola VINE VOICE on December 12, 2016 Format: Kindle Edition
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

quora_pairs

  • Dataset: quora_pairs at 451a485
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 6 tokens, mean: 13.56 tokens, max: 43 tokens | min: 6 tokens, mean: 13.58 tokens, max: 44 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | Do babies dream? | Do babies dream while they are sleeping? |
    | What is the point of being married? | What are some arguments for getting married? |
    | How do you boost your mobile signal strength? | How can you increase mobile signal strength? |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
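
Each dataset entry in this card reports min, mean, and max token counts over the first 1000 samples. A minimal sketch of how such a summary could be computed is shown here. The helper `token_length_stats` is hypothetical, and whitespace splitting stands in for the model's tokenizer, so the counts it produces will be lower than the subword counts listed in the card.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, List


def token_length_stats(
    texts: Iterable[str],
    tokenize: Callable[[str], List[str]] = str.split,
    limit: int = 1000,
) -> Dict[str, float]:
    """Return min / mean / max token counts over the first `limit` texts."""
    lengths = [len(tokenize(text)) for text in islice(texts, limit)]
    if not lengths:
        raise ValueError("no texts to summarise")
    return {
        "min": min(lengths),
        "mean": round(sum(lengths) / len(lengths), 2),
        "max": max(lengths),
    }


# The three quora_pairs sentence1 samples listed above.
quora_sentence1 = [
    "Do babies dream?",
    "What is the point of being married?",
    "How do you boost your mobile signal strength?",
]
quora_stats = token_length_stats(quora_sentence1)
print(quora_stats)
```

Passing the tokenizer as a parameter means a real subword tokenizer can be swapped in without changing the helper.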
    

gooaq_pairs

  • Dataset: gooaq_pairs at b089f72
  • Size: 14,550 training samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 8 tokens, mean: 11.38 tokens, max: 21 tokens | min: 16 tokens, mean: 58.0 tokens, max: 143 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | when can baby sit in jumperoo? | The best age for babies to use jumperoos depends on your own baby, how well they're able to hold their head up, how much upper body support they need, and the product you're using. However, we'd say don't put any baby in a jumperoo before they're 4 months old – just to be on the safe side. |
    | what are the three main categories of mutations? | There are three types of DNA Mutations: base substitutions, deletions and insertions. Single base substitutions are called point mutations, recall the point mutation Glu -----> Val which causes sickle-cell disease. Point mutations are the most common type of mutation and there are two types. |
    | is vpn not working in uae? | The UAE actually has laws related to the use of VPNs. Indeed, UAE law says that a VPN is only illegal if it's used to commit a crime. The Telecom Regulatory Authority (TRA) is responsible for internet censorship in the UAE. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

Evaluation Datasets

nli-pairs

  • Dataset: nli-pairs at d482672
  • Size: 150 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    |         | anchor | positive |
    |:--------|:-------|:---------|
    | type    | string | string |
    | details | min: 5 tokens, mean: 17.17 tokens, max: 36 tokens | min: 5 tokens, mean: 9.51 tokens, max: 21 tokens |
  • Samples:
    | anchor | positive |
    |:-------|:---------|
    | Two women are embracing while holding to go packages. | Two woman are holding packages. |
    | Two young children in blue jerseys, one with the number 9 and one with the number 2 are standing on wooden steps in a bathroom and washing their hands in a sink. | Two kids in numbered jerseys wash their hands. |
    | A man selling donuts to a customer during a world exhibition event held in the city of Angeles | A man selling donuts to a customer. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

vitaminc-pairs

  • Dataset: vitaminc-pairs at be6febb
  • Size: 150 evaluation samples
  • Columns: label, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | label | sentence1 | sentence2 |
    |:--------|:------|:----------|:----------|
    | type    | int | string | string |
    | details | 1: 100.00% | min: 8 tokens, mean: 16.8 tokens, max: 53 tokens | min: 9 tokens, mean: 36.4 tokens, max: 145 tokens |
  • Samples:
    | label | sentence1 | sentence2 |
    |:------|:----------|:----------|
    | 1 | Oughterard is one of the Catholic parishes which form Connemara . | Connemara is composed of the Catholic parishes of Carna , Clifden ( Omey and Ballindoon ) , Ballynakill , Roundstone , Oughterard and Inishbofin . |
    | 1 | Miroslav Klose retired in August 2014 . | He was the highest male international scorer among active players following Miroslav Klose 's retirement in August 2014 . |
    | 1 | The film Office Space made under $ 12.9 million against a budget of more than $ 9 million . | Although not a big success at the box office , making $ 12.8 million against a $ 10 million budget , the film was well received by critics and sold well on home video , and has become a cult film. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

qnli-contrastive

  • Dataset: qnli-contrastive at bcdcba7
  • Size: 150 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 | label |
    |:--------|:----------|:----------|:------|
    | type    | string | string | int |
    | details | min: 7 tokens, mean: 14.44 tokens, max: 29 tokens | min: 4 tokens, mean: 37.64 tokens, max: 115 tokens | 0: 100.00% |
  • Samples:
    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | What came into force after the new constitution was herald? | As of that day, the new constitution heralding the Second Republic came into force. | 0 |
    | What is the first major city in the stream of the Rhine? | The most important tributaries in this area are the Ill below of Strasbourg, the Neckar in Mannheim and the Main across from Mainz. | 0 |
    | What is the minimum required if you want to teach in Canada? | In most provinces a second Bachelor's Degree such as a Bachelor of Education is required to become a qualified teacher. | 0 |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "OnlineContrastiveLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.25,
        "prior_layers_weight": 2.5,
        "kl_div_weight": 5,
        "kl_temperature": 0.25
    }
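
The qnli-contrastive set above wraps OnlineContrastiveLoss, and every evaluation sample carries label 0. The following sketch illustrates, in plain Python, the hard-pair selection that an online contrastive loss typically performs on such a batch. It is not the library code: the function name `online_contrastive_loss`, the margin of 0.5, the cutoff rules, and the distance values are all assumptions made for illustration.

```python
from typing import List, Sequence


def online_contrastive_loss(
    distances: Sequence[float],
    labels: Sequence[int],
    margin: float = 0.5,
) -> float:
    """Contrastive loss restricted to hard positives and hard negatives."""
    pos: List[float] = [d for d, y in zip(distances, labels) if y == 1]
    neg: List[float] = [d for d, y in zip(distances, labels) if y == 0]

    def _mean(values: List[float]) -> float:
        return sum(values) / len(values) if values else 0.0

    # Hard negatives: closer than the farthest positive. With fewer than two
    # positives, fall back to negatives closer than the mean negative distance.
    neg_cutoff = max(pos) if len(pos) > 1 else _mean(neg)
    # Hard positives: farther than the closest negative, with the mirrored fallback.
    pos_cutoff = min(neg) if len(neg) > 1 else _mean(pos)

    hard_neg = [d for d in neg if d < neg_cutoff]
    hard_pos = [d for d in pos if d > pos_cutoff]

    positive_term = sum(d ** 2 for d in hard_pos)
    negative_term = sum(max(0.0, margin - d) ** 2 for d in hard_neg)
    return positive_term + negative_term


# A batch with only label 0, as in the evaluation split above (made-up distances).
all_negative = online_contrastive_loss([0.2, 0.4, 0.9], [0, 0, 0])
# A mixed batch for comparison.
mixed = online_contrastive_loss([0.1, 0.6, 0.3, 0.8], [1, 1, 0, 0])
print(round(all_negative, 4), round(mixed, 4))
```

Under these assumptions, a batch with no positive pairs still yields a training signal: the negatives that sit closer than average are pushed apart until they clear the margin.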
    

scitail-pairs-qa

  • Dataset: scitail-pairs-qa at 0cc4353
  • Size: 150 evaluation samples
  • Columns: sentence2 and sentence1
  • Approximate statistics based on the first 1000 samples:
    |         | sentence2 | sentence1 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 7 tokens, mean: 16.18 tokens, max: 30 tokens | min: 8 tokens, mean: 15.43 tokens, max: 32 tokens |
  • Samples:
    | sentence2 | sentence1 |
    |:----------|:----------|
    | Antarctica is the only continent without amphibians. | What is the only continent without amphibians? |
    | Air can be separated into several elements. | Which of the following substances can be separated into several elements? |
    | Ice is the common term for water in its solid state. | What is the common term for water in its solid state? |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

scitail-pairs-pos

  • Dataset: scitail-pairs-pos at 0cc4353
  • Size: 150 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 | label |
    |:--------|:----------|:----------|:------|
    | type    | string | string | int |
    | details | min: 7 tokens, mean: 23.1 tokens, max: 61 tokens | min: 8 tokens, mean: 15.48 tokens, max: 36 tokens | 0: ~53.33%, 1: ~46.67% |
  • Samples:
    | sentence1 | sentence2 | label |
    |:----------|:----------|:------|
    | An introduction to atoms and elements, compounds, atomic structure and bonding, the molecule and chemical reactions. | Replace another in a molecule happens to atoms during a substitution reaction. | 0 |
    | Wavelength The distance between two consecutive points on a sinusoidal wave that are in phase; | Wavelength is the distance between two corresponding points of adjacent waves called. | 1 |
    | humans normally have 23 pairs of chromosomes. | Humans typically have 23 pairs pairs of chromosomes. | 1 |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

xsum-pairs

  • Dataset: xsum-pairs at 788ddaf
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 62 tokens, mean: 324.05 tokens, max: 512 tokens | min: 9 tokens, mean: 27.27 tokens, max: 44 tokens |
  • Samples:
    sentence1 sentence2
    Media playback is not supported on this device
    The Lions started the tour of New Zealand with a scratchy victory over the Provincial Barbarians before a loss to the Blues, but recovered to record a significant 12-3 win in Christchurch.
    "It's been tough this week, there's been a lot of criticism," Gatland said.
    "People have written the tour off already after two games.
    "That's been challenging for all of us. We've had to stay strong in the group and keep the faith.
    "I hope we didn't disappoint any people tonight with the result."
    Fly-half Owen Farrell kicked four penalties for the Lions, while a heroic defensive effort managed to keep the Crusaders - who have averaged 37 points across 14 straight victories in Super Rugby - to merely a penalty.
    The Lions now face the Highlanders, New Zealand Maori and the Chiefs before the first Test against New Zealand on June 24.
    "This is great preparation for us preparing to play the best team in the world, which is the All Blacks," Gatland added.
    "It's a like a club side coming together in pre-season, getting a couple of games under its belt and you know the more time together the better you'll get.
    "This team was outstanding in training on Friday, looked sharp and I knew there would be a performance because they have had time to gel.
    "The result was pretty important for us. Tonight was another step up, but there is still a lot to work on."
    One of those areas is their finishing, after the Lions spurned a handful of opportunities to score tries against the Super Rugby side.
    "We are creating [chances], and we need to get better at [finishing]. The more time we have together, hopefully we will finish those chances."
    Gatland also confirmed tour captain Sam Warburton would be involved against the Highlanders in Dunedin next week, having recovered from a minor ankle injury.
    Full-back Stuart Hogg and centre Jonathan Davies will both undergo concussion return-to-play protocols after failing Head Injury Assessments during the game.
    "We've laid a marker down a little bit tonight, now it's a big challenge for the team that takes the field on Tuesday," Gatland said.
    Meanwhile, second-row George Kruis was part of the outstanding forward effort and feels the Lions pack has made a statement with the Test series a fortnight away.
    "We had a good contest today, and probably got the upper hand," the Saracens and England lock said.
    "There were six internationals in their pack, and we knew it was going to be a tasty game. It got a bit heated at times, but we held our own and did a good job.
    "We relish the opportunity to go toe-to-toe with a pack like that. We talk about how we want to be a brutal pack and a set-piece dominant pack, and today we showed good signs of that.
    "It's every boy's dream to play for the Lions, and to get a win like that today, hopefully we can really start to build this culture and build towards the Tests."
    British and Irish Lions head coach Warren Gatland says his players had to "keep the faith" as they prepared for Saturday's key win over the Crusaders.
    Mr Umunna, who pulled out the race himself earlier this month, said Ms Kendall was best placed to drag the party out of its "comfort zone".
    He told the New Statesman Ms Kendall had "challenged conventional wisdom" and asked tough questions about Labour's future after its defeat.
    Andy Burnham, Yvette Cooper and Mary Creagh are also standing.
    Candidates must get the support of 35 MPs by 15 June, when nominations close, in order to get on the ballot paper. The winner will be announced on 12 September.
    Ms Kendall was the first candidate to publicly declare her interest in the job after Ed Miliband's resignation.
    The shadow care services minister, who was elected to Parliament in 2010, had already won the support of shadow education secretary Tristram Hunt and shadow Europe minister Pat McFadden.
    Now Mr Umunna has said he is throwing his weight behind her and that three other MPs who were part of his short-lived leadership team - Emma Reynolds, Jonathan Reynolds and Stephen Twigg - were also doing the same.
    "In this time of change our party must move beyond its comfort zone and find new ways of realising its age-old goals of equality and freedom," he wrote in the New Statesman.
    Labour's next leader, he suggested, must embrace a "vision of a Britain in which all can get on, whose citizens are financially secure and in control of their lives and happiness - and are, collectively, secure and effective in the wider world".
    "For us, our next leader must get this vision right," he wrote.
    "On all these big subjects, Liz Kendall has asked the tough questions and started to chart a course to the answers. She has been courageous in challenging conventional wisdom. She has no compunction in moving Labour beyond our comfort zone and is determined to build a team ready to chart a route forward."
    Ms Kendall has promised a new approach to business, education and defence, claiming Labour lost the election because its policies were wrong and mistakenly believed the county had moved to the left.
    Mr Burnham has won the backing of frontbenchers Rachel Reeves, Dan Jarvis and Michael Dugher, as well as former deputy prime minister Lord Prescott. Yvette Cooper has been endorsed by Vernon Coaker and John Healey among others.
    Ms Creagh told the BBC that she was confident that she would get sufficient nominations to get on the ballot paper. "A lot of people have already made a decision but a lot of people are rightly consulting with their parties," she told Radio 4's Woman's Hour.
    While Labour could win the next election, Ms Creagh warned that the party would "cease to exist" if it took its voters for granted and did not address the separate challenges facing it in Scotland, the north of England and southern England.
    Mr Umunna pulled out of the race only days after entering, saying he was uncomfortable with media scrutiny of his family.
    Labour leadership candidate Liz Kendall has won the backing of shadow business secretary Chuka Umunna.
    The woman sustained leg and head injuries in the incident on the A12 just south of Chelmsford, at 01:30 GMT.
    A 41-year-old man from Sevenoaks, Kent, has been arrested on suspicion of drinking and driving and causing grievous bodily harm.
    The southbound carriageway was closed between junctions 16 and 15 until 07:00 GMT.
    A woman is in a critical condition after she was hit by a car on a dual carriageway in Essex.
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 2,
        "kl_div_weight": 1,
        "kl_temperature": 0.5
    }
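
The xsum-pairs set above wraps MultipleNegativesSymmetricRankingLoss, which ranks each positive against the other in-batch candidates in both directions. The following is a small sketch of that symmetric objective on a precomputed similarity matrix. The function names, the scale of 20, and the toy matrices are assumptions for illustration, and the library itself operates on embedding tensors rather than Python lists.

```python
import math
from typing import List

Matrix = List[List[float]]


def _diagonal_cross_entropy(scores: Matrix) -> float:
    """Mean cross-entropy where row i's correct class is column i."""
    total = 0.0
    for i, row in enumerate(scores):
        peak = max(row)
        log_sum_exp = peak + math.log(sum(math.exp(s - peak) for s in row))
        total += log_sum_exp - row[i]
    return total / len(scores)


def symmetric_ranking_loss(sim: Matrix, scale: float = 20.0) -> float:
    """Average of the anchor-to-positive and positive-to-anchor ranking losses."""
    scaled = [[scale * s for s in row] for row in sim]
    transposed = [list(column) for column in zip(*scaled)]
    forward = _diagonal_cross_entropy(scaled)       # each anchor ranks the positives
    backward = _diagonal_cross_entropy(transposed)  # each positive ranks the anchors
    return (forward + backward) / 2


# sim[i][j] is the cosine similarity between text i and summary j (made-up values).
aligned = [[0.9, 0.1], [0.2, 0.8]]
misaligned = [[0.1, 0.9], [0.8, 0.2]]

aligned_loss = symmetric_ranking_loss(aligned)
misaligned_loss = symmetric_ranking_loss(misaligned)
print(aligned_loss < misaligned_loss)
```

Because the two directions are averaged, transposing the similarity matrix leaves the loss unchanged, which is the sense in which the objective is symmetric.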
    

compression-pairs

  • Dataset: compression-pairs at 605bc91
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 11 tokens, mean: 32.65 tokens, max: 157 tokens | min: 5 tokens, mean: 10.31 tokens, max: 27 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | Oil jumped $4 to record highs over $120 a barrel on Monday on the weaker US dollar and supply concerns from OPEC members Nigeria and Iran. | Oil jumps $4 to record over $120 on weak dollar |
    | MIAMI - Hurricane Celia has weakened a bit in the Pacific but is still a Category 2 storm. | Hurricane Celia weakens but still Category 2 storm |
    | The Wisconsin recall election is officially underway, with voters heading to the polls to decide whether or not to recall highly publicized Republican governor Scott Walker in favor Democratic Milwaukee Mayor Tom Barrett. | Wisconsin recall election underway |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "MultipleNegativesSymmetricRankingLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 0.75,
        "prior_layers_weight": 2,
        "kl_div_weight": 1,
        "kl_temperature": 0.5
    }
    

sciq_pairs

  • Dataset: sciq_pairs at 2c94ad3
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |:--------|:----------|:----------|
    | type    | string | string |
    | details | min: 7 tokens, mean: 17.42 tokens, max: 42 tokens | min: 2 tokens, mean: 84.46 tokens, max: 464 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |:----------|:----------|
    | Salt in seawater causes it to have greater what, which is also affected by temperature and pressure? | Seawater has lots of salts in it. This increases its density (mass per volume) over fresh water. Temperature and pressure also affect density. |
    | Halogens tend to form salts with what type of element? | Halogens have filled valence electron configurations. Halogens tend to form salts with metals. As the free elements, halogens are monatomic. Halogens have appreciable nonmetallic character. Halogens tend to have an oxidation state of −1. Halogens are good reductants. |
    | What is type of substance is formed when water vapor condenses or when ice melts? | Liquid water is formed when water vapor condenses (i. e. , H 2 O(g) → H 2 O(l) or when ice melts (i. e. , H 2 O(s) → H 2 O(l)). Because water is a molecular substance, it is a poor conductor of electricity in its pure form. However, as we will see later, its conductivity can be improved by the addition of certain substances. Water molecules are polar, and this overall polarity gives rise to many of the properties of water. For example, an interesting effect is seen when water is placed in a static electric field, as shown in the Figure below and the video below. This phenomenon can be explained in terms of the polarity of water molecules. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

qasc_pairs

  • Dataset: qasc_pairs at a34ba20
  • Size: 150 evaluation samples
  • Columns: id, sentence1, and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | id | sentence1 | sentence2 |
    |:--------|:---|:----------|:----------|
    | type    | string | string | string |
    | details | min: 17 tokens, mean: 21.05 tokens, max: 26 tokens | min: 6 tokens, mean: 11.85 tokens, max: 22 tokens | min: 16 tokens, mean: 35.73 tokens, max: 56 tokens |
  • Samples:
    | id | sentence1 | sentence2 |
    |:---|:----------|:----------|
    | 3E7TUJ2EGCLQNOV1WEAJ2NN97VY9DJ | Something that comes from polluted what has a negative impact on water quality? | acid rain has a negative impact on water quality. Acid rain comes from polluted clouds.. Something that comes from polluted clouds has a negative impact on water quality. |
    | 345LHZDEDXRQPOH710ZYLAOBITP3UH | Plastic and other mulches offer a barrier to what? | Spores may be dispersed by moving water, wind, or other organisms.. Plastic and other mulches offer a barrier to spore dispersal.. Plastic and other mulches offer a barrier to spores moving |
    | 31JLPPHS2UTVCJXA5ENPM4WMXAI3OO | What happens if Mars becomes too hot? | if a planet becomes too hot then that planet cannot sustain life. Another name for Mars is the Red Planet.. If Mars becomes too hot then Mars cannot sustain life |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
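
The parameters above can be read as weights on three kinds of terms: the loss at the final layer, the losses at earlier layers, and a KL-divergence term that pulls earlier layers toward the final one. The sketch below shows one plausible way such weights could combine into a single scalar. It is an assumption for illustration only: the function name, the 1/(1 + layer_idx) damping and the temperature scaling of the KL term are not confirmed by this card, and the library's actual weighting may differ.

```python
def adaptive_layer_total(
    last_loss,
    prior_losses,
    prior_kl_divs,
    last_layer_weight=1.5,
    prior_layers_weight=0.75,
    kl_div_weight=1.25,
    kl_temperature=1.1,
):
    """Combine per-layer losses into one scalar (hypothetical helper).

    last_loss      -- base loss (here GISTEmbedLoss) on the final layer
    prior_losses   -- base loss evaluated at each earlier layer, shallowest first
    prior_kl_divs  -- KL divergence of each earlier layer from the final layer
    """
    total = last_layer_weight * last_loss
    n_prior = len(prior_losses)
    for layer_idx, (layer_loss, kl_div) in enumerate(zip(prior_losses, prior_kl_divs)):
        # ASSUMPTION: earlier-layer losses are averaged and damped by 1 / (1 + layer_idx).
        total += prior_layers_weight * layer_loss / (1 + layer_idx) / n_prior
        # ASSUMPTION: the KL term is scaled by the temperature and its weight.
        total += kl_div_weight * kl_temperature * kl_div
    return total
```

With n_layers_per_step set to -1, every earlier layer would contribute on each step, so prior_losses would hold one entry per non-final layer.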
    

openbookqa_pairs

  • Dataset: openbookqa_pairs at 388097e
  • Size: 103 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 4 tokens, mean: 12.84 tokens, max: 37 tokens | min: 5 tokens, mean: 11.15 tokens, max: 24 tokens |
  • Samples:
    sentence1 sentence2
    Humans sometimes eat what? humans sometimes eat seeds
    if something moves faster than before, it might have been affected by what? force causes the speed of an object to increase
    A person wants to turn on an MP3 player, so they complete a circuit by pushing a button sometimes completes a circuit
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
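
The "Approximate statistics" rows report the minimum, mean and maximum token count per column over at most the first 1000 samples. A minimal sketch of that computation follows. It assumes whitespace splitting as a stand-in tokenizer, so its counts will not match the card's figures, which presumably come from the model's own subword tokenizer; `token_stats` and its parameters are hypothetical names.

```python
from statistics import mean


def token_stats(texts, tokenize=str.split, limit=1000):
    """Return min/mean/max token counts over the first `limit` texts.

    ASSUMPTION: `tokenize` defaults to whitespace splitting, which is only a
    stand-in for the subword tokenizer that produced the card's numbers.
    """
    counts = [len(tokenize(text)) for text in texts[:limit]]
    return {
        "min": min(counts),
        "mean": round(mean(counts), 2),
        "max": max(counts),
    }


sentence2_samples = [
    "humans sometimes eat seeds",
    "force causes the speed of an object to increase",
]
stats = token_stats(sentence2_samples)
```

Because slicing stops at the end of the list, a split with fewer than 1000 rows (such as the 103 samples here) is simply measured in full.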
    

msmarco_pairs

  • Dataset: msmarco_pairs at 28ff31e
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 4 tokens, mean: 8.55 tokens, max: 25 tokens | min: 17 tokens, mean: 76.77 tokens, max: 180 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |-----------|-----------|
    | is there a victoria secret in london | As Victoria's Secret opens the doors of its London Bond Street store, a fashion blogger goes undie-cover to give us their verdict. By Emily Johnston. Published: 12:43 EST, 29 August 2012 |
    | when was cafe terrace at night create | Café Terrace at Night. Café Terrace at Night, also known as The Cafe Terrace on the Place du Forum, is an oil painting executed by the Dutch artist Vincent van Gogh while at Arles, France, in mid-September 1888. The painting is not signed, but described and mentioned by the artist in three letters. |
    | what does chohan | Chauhan, Chouhan or Chohan is a community sometimes described as a tribe and sometimes as a caste. In the medieval period some those associated with it ruled parts of Northern India and one, Prithviraj Chauhan, was the king of Delhi.ajput bardic accounts, which are based on mythology, describe the Chauhans as one of the four Agnikula Rajput clans who claim to have originated from a sacrificial fire-pit (agnikunda) at Mount Abu. These claims of supernatural origin are clearly improbable and unacceptable to the modern mind. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
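
GISTEmbedLoss, the base loss named in the block above, uses a guide model to decide which in-batch candidates are safe to treat as negatives. The sketch below illustrates that selection idea on a plain similarity matrix: a candidate is masked out when the guide model rates it as more similar to the anchor than the anchor's own positive. This is a simplified assumption of the technique, not the library's implementation, and `gist_false_negative_mask` is a hypothetical name.

```python
def gist_false_negative_mask(guide_sim):
    """Mark in-batch candidates that should NOT be used as negatives.

    guide_sim[i][j] is the guide model's similarity between anchor i and
    candidate j; the diagonal entry guide_sim[i][i] is the true pair.
    ASSUMPTION: a candidate is dropped when its guide similarity is strictly
    greater than that of the true pair.
    """
    return [
        [j != i and sim > row[i] for j, sim in enumerate(row)]
        for i, row in enumerate(guide_sim)
    ]


# Toy guide similarities for a batch of three (anchor, positive) pairs.
guide_sim = [
    [0.90, 0.95, 0.10],
    [0.20, 0.80, 0.30],
    [0.10, 0.85, 0.70],
]
mask = gist_false_negative_mask(guide_sim)
```

In this toy batch, candidate 1 is masked for anchors 0 and 2, so the contrastive term for those anchors would ignore it instead of pushing it away.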
    

nq_pairs

  • Dataset: nq_pairs at f9e894e
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 10 tokens, mean: 11.69 tokens, max: 17 tokens | min: 20 tokens, mean: 128.35 tokens, max: 332 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |-----------|-----------|
    | how many super bowl rings does the raiders have | Oakland Raiders The Raiders are known for their extensive fan base and distinctive team culture. Since 1963, the team has won 15 division titles (three AFL and 12 NFL), four AFC Championships (1976, 1980, 1983, and 2002), one AFL Championship (1967), and three Super Bowl Championships (XI, XV, and XVIII). The Raiders have 14 former members who have been enshrined in the Pro Football Hall of Fame. |
    | when did libya give up its nuclear weapons | Disarmament of Libya In 1968, Libya became signatory of Nuclear Non-Proliferation Treaty (NPT), ratified the treaty in 1975, and concluded a safeguards agreement in 1980. Despite its commitment to NPT, there are reports indicating that Muammar Gaddafi of Libya either made unsuccessful attempts to build or entered in an agreement to purchase a nuclear weapon from nuclear-armed nations. In the 1970s–80s, Gaddafi made numerous attempts to accelerate and push forward his ambitions for an active nuclear weapons program, using the nuclear black market sources. However, after the end of the Cold War in 1991, Gaddafi sought to resolve its nuclear crises with the United States aiming to uplift the sanctions against Libya, finally agreeing to authorize rolling back Libya's weapons of mass destruction program on December 2003. |
    | how did thomas fire get it's name | Thomas Fire On December 4, 2017, the Thomas Fire was reported at 6:26 p.m. PST,[36] to the north of Santa Paula, near Steckel Park and Thomas Aquinas College,[3][24] after which the fire is named.[37] That night, the small brush fire exploded in size and raced through the rugged mountain terrain that lies west of Santa Paula, between Ventura and Ojai.[19][38] Officials blamed strong Santa Ana winds that gusted up to 60 miles per hour (97 km/h) for the sudden expansion.[28][39] Soon after the fire had started, a second blaze was ignited nearly 30 minutes later, about 4 miles (6.4 km) to the north in Upper Ojai at the top of Koenigstein Road.[40] According to eyewitnesses, this second fire was sparked by an explosion in the power line over the area. The second fire was rapidly expanded by the strong Santa Ana winds, and soon merged into the Thomas Fire later that night.[40] |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

trivia_pairs

  • Dataset: trivia_pairs at a7c36e3
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 9 tokens, mean: 15.91 tokens, max: 36 tokens | min: 58 tokens, mean: 447.24 tokens, max: 512 tokens |
  • Samples:
    sentence1 sentence2
    In the 2009 animated film ‘Up’ who is the voice of explorer Charles F Muntz? Charles F. Muntz
    As at the start of 2003, what is the make and model of the bestselling car of all time? Top 50: Best Selling Cars Of All Time Top 50: Best Selling Cars Of All Time Updated on February 20, 2009 Introduction With all the millions of cars made and sold over the last 100 years, what are the best selling? This Top 50 has all the biggest sellers from around the world. The info on sales has been found all over the net to compile a current list of the big sellers. Any car with only one date and a + after the number is currently in production. The Chevrolet Camaro is not as "in production" because it not due out till Spring 09. 50. Peugeot 405 (1988-1997) - 3,461,800 50. Peugeot 405 (1988-1997) - 3,461,800 49. Peugeot 504: (1968-2005) - 3,713,400 49. Peugeot 504: (1968-2005) - 3,713,400 48. Fiat 127: (1971-1983) - 3,750,000 48. Fiat 127: (1971-1983) - 3,750,000 47. Citroen 2CV: (1948-1990) - 3,872,583 47. Citroen 2CV: (1948-1990) - 3,872,583 46. Fiat 500: (1957- ) - 3,900,000+ 46. Fiat 500: (1957- ) - 3,900,000+ 45. Pontiac Grand Am: (1973-2005) - 4,000,000 45. Pontiac Grand Am: (1973-2005) - 4,000,000 44. Ford Cortina: (1962-1982) - 4,279,079 44. Ford Cortina: (1962-1982) - 4,279,079 43. Ford Model A: (1927-31) - 4,320,446 43. Ford Model A: (1927-31) - 4,320,446 42. Opel Ascona: (1970-1988) - 4,400,000 42. Opel Ascona: (1970-1988) - 4,400,000 41. Fiat 126: (1973-2000) - 4,671,586 41. Fiat 126: (1973-2000) - 4,671,586 40. Chevrolet Camaro: (1967-2002) - 4,800,000 40. Chevrolet Camaro: (1967-2002) - 4,800,000 39. Ford Ranger: (1983- ) - 5,150,000+ 39. Ford Ranger: (1983-) - 5,150,000+ 38. Ford E-Series: (1961- ) - 5,200,000+ 38. Ford E-Series: (1961- ) - 5,200,000+ 37. Peugeot 205: (1983-1998) - 5,278,000 37. Peugeot 205: (1983-1998) - 5,278,000 36. Toyota Land Cruiser: (1953- ) - 5,300,000+ 36. Toyota Land Cruiser: (1953- ) - 5,300,000+ 35. Ford Crown Victoria: (1980- ) - 5,500,000+ 35. Ford Crown Victoria: (1980- ) - 5,500,000+ 34. Ford Focus: (1998- ) - 5,500,000+ 34. 
Ford Focus: (1998- ) - 5,500,000+ 33. Mitsubishi Galant: (1969- ) - 5,550,000+ 33. Mitsubishi Galant: (1969- ) - 5,550,000+ 32. Ford Explorer: (1991- ) - 5,700,00+ 32. Ford Explorer: (1991- ) - 5,700,00+ 31. Nissan Sunny: (1966- ) - 5,900,000+ 31. Nissan Sunny: (1966- ) - 5,900,000+ 30. Buick Le Sabre: (1959-2005) - 6,000,000 30. Buick Le Sabre: (1959-2005) - 6,000,000 29. Peugeot 206: (1998- 2007 ) - 6,100,000 29. Peugeot 206: (1998-2007) - 6,100,000 28. Chevrolet Cavalier: (1982-2005) - 6,200,000 28. Chevrolet Cavalier: (1982-2005) - 6,200,000 27. Vauxhall/Opel Vectra: (1988-2008) - 6,500,000 27. Vauxhall/Opel Vectra: (1988-2008) - 6,500,000 26. BMC/BL/BMW Mini: (1959- ) - 6,700,000+ 26. BMC/BL/BMW Mini: (1959- ) - 6,700,000+ 25. Ford Taurus: (1986- ) - 6,750,000+ 25. Ford Taurus: (1986- ) - 6,750,000+ 24. Fiat Punto: (1993- ) - 6,800,000+ 24. Fiat Punto: (1993- ) - 6,800,000+ 23. Renault 4: (1961-1992) - 8,150,000 23. Renault 4: (1961-1992) - 8,150,000 22. Ford Mustang: (1964- ) - 8,300,000+ 22. Ford Mustang: (1964- ) - 8,300,000+ 21. Renault 5: (1972-1996) - 8,800,000 21. Renault 5: (1972-1996) - 8,800,000 20. Renault Clio: (1991- ) - 8,900,000+ 20. Renault Clio: (1991- ) - 8,900,000+ 19. Fiat Uno: (1983- ) - 9,150,000+ 19. Fiat Uno: (1983- ) - 9,150,000+ 18. BMW 3-Series: (1977- ) - 9,800,000+ 18. BMW 3-Series: (1977- ) - 9,800,000+ 17. Vauxhall/Opel Astra: (1991- ) - 10,000,000+ 17. Vauxhall/Opel Astra: (1991- ) - 10,000,000+ 16. Mazda 323: (1963-2003) - 10,480,000 16. Mazda 323: (1963-2003) - 10,480,000 15. Toyota Camry: (1983- ) - 10,500,000+ 15. Toyota Camry: (1983- ) - 10,500,000+ 14. Chrysler Voyager: (1984- ) - 11,700,000+ 14. Chrysler Voyager: (1984- ) - 11,700,000+ 13. Oldsmobile Cutlass: (1961-99) - 11,900,000 13. Oldsmobile Cutlass: (1961-99) - 11,900,000 12. Vauxhall/Opel Corsa: (1982- ) - 12,000,000+ 12. Vauxhall/Opel Corsa: (1982- ) - 12,000,000+ 11. Ford Fiesta: (1976- ) - 12,500,000+ 11. Ford Fiesta: (1976- ) - 12,500,000+ 10. 
Chevrolet Impala: (1958- ) - 14,000,000+ 10. Chevrolet Impala: (1958- ) - 14,000,000+ 9. Volkswagen Passat: (1973- ) - 14,100,000+ 9. Volkswagen Passat: (1973- ) - 14,100,000+ 8. Honda Accord: (1976- ) - 15
    The 1963 film ‘The Birds’ is based on a story by which novelist? The Birds The Birds There are no active dates for this event. Not Available Thursday Jan 21, 2016 7:30 PM - Saturday Feb 20, 2016 7:30 PM
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

quora_pairs

  • Dataset: quora_pairs at 451a485
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 7 tokens, mean: 14.21 tokens, max: 37 tokens | min: 7 tokens, mean: 13.43 tokens, max: 30 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |-----------|-----------|
    | How long does it take for alcohol to leave your system? I'm getting a drug test. | How long does it take for alcohol to completely leave your system? |
    | Whom should one follow on Quora? And why? | Which are some of the most viewed writers I should follow from every topic on Quora? |
    | What happens when best friends fall in love? | Love: What is it like to fall in love with your best friend? |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

gooaq_pairs

  • Dataset: gooaq_pairs at b089f72
  • Size: 150 evaluation samples
  • Columns: sentence1 and sentence2
  • Approximate statistics based on the first 1000 samples:
    |         | sentence1 | sentence2 |
    |---------|-----------|-----------|
    | type    | string | string |
    | details | min: 8 tokens, mean: 11.57 tokens, max: 16 tokens | min: 21 tokens, mean: 56.68 tokens, max: 114 tokens |
  • Samples:
    | sentence1 | sentence2 |
    |-----------|-----------|
    | are buses running from wellington to taunton? | Is there a direct bus between Wellington and Taunton? Yes, there is a direct bus departing from Wellington, Post Office and arriving at Taunton, County Hall. Services depart every 30 minutes, and operate every day. |
    | 1 kwh is equal to ampere? | The electrical charge in amp-hours is equal to the energy in kilowatt-hours times 1,000, then divided by the voltage. For example, let's convert 5 kWh at 120 V to Ah. You might also want to convert watt-hours to milliamp-hours. |
    | are headaches associated with ptsd? | When it comes to headaches, patients with migraine or tension headaches report high rates of exposure to traumatic events. In addition, about 17% have symptoms consistent with a PTSD diagnosis. Another study found that 32 percent of OEF/OIF veterans with PTSD say that they have problems with headaches. |
  • Loss: AdaptiveLayerLoss with these parameters:
    {
        "loss": "GISTEmbedLoss",
        "n_layers_per_step": -1,
        "last_layer_weight": 1.5,
        "prior_layers_weight": 0.75,
        "kl_div_weight": 1.25,
        "kl_temperature": 1.1
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 3e-05
  • weight_decay: 5e-05
  • num_train_epochs: 5
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.3
  • save_safetensors: False
  • fp16: True
  • push_to_hub: True
  • hub_model_id: bobox/DeBERTa-ST-AllLayers-testing-v3-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • batch_sampler: no_duplicates
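
To make the scheduler settings above concrete, the following sketch computes the learning rate at a given step for a linear warmup followed by cosine decay with hard restarts. It follows the formula commonly used for this schedule type and is not taken from this card. The total step count is an assumption: 5 epochs at roughly 5874 steps per epoch, inferred from the training logs (step 5880 is logged at epoch 1.0010).

```python
import math

BASE_LR = 3e-5        # learning_rate
NUM_CYCLES = 3        # lr_scheduler_kwargs: {'num_cycles': 3}
TOTAL_STEPS = 29370   # ASSUMPTION: 5 epochs x 5874 steps, inferred from the logs
WARMUP_STEPS = 8811   # warmup_ratio 0.3 applied to the assumed TOTAL_STEPS


def lr_at(step):
    """Learning rate at `step`: linear warmup, then cosine decay with restarts."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    if progress >= 1.0:
        return 0.0
    # The modulo restarts the cosine curve NUM_CYCLES times over the decay phase.
    cycle_position = (NUM_CYCLES * progress) % 1.0
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * cycle_position)))


for step in (147, 4410, 8811, 15600, 15700):
    print(step, f"{lr_at(step):.3e}")
```

Under these assumptions the rate climbs linearly for the first 8811 steps, peaks at 3e-05, and then falls and resets three times before reaching zero at the final step.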

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 3e-05
  • weight_decay: 5e-05
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {'num_cycles': 3}
  • warmup_ratio: 0.3
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: bobox/DeBERTa-ST-AllLayers-testing-v3-checkpoints-tmp
  • hub_strategy: all_checkpoints
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
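
The no_duplicates batch sampler matters for losses that use other samples in the batch as negatives: if the same text appears twice in a batch, one copy would be treated as a negative for the other. The sketch below is a simplified greedy version of that idea, assuming each sample is a tuple of texts. It is not the library's sampler, and `no_duplicate_batches` is a hypothetical name.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedily group sample indices so no text repeats within a batch.

    samples    -- list of tuples of texts, e.g. (sentence1, sentence2)
    batch_size -- maximum number of samples per batch
    ASSUMPTION: samples that clash with the current batch are deferred to a
    later batch rather than dropped, and short final batches are kept.
    """
    remaining = list(range(len(samples)))
    batches = []
    while remaining:
        batch, seen_texts, deferred = [], set(), []
        for idx in remaining:
            texts = set(samples[idx])
            if len(batch) < batch_size and seen_texts.isdisjoint(texts):
                batch.append(idx)
                seen_texts |= texts
            else:
                deferred.append(idx)
        batches.append(batch)
        remaining = deferred
    return batches


pairs = [("a", "b"), ("a", "c"), ("d", "e"), ("f", "b")]
batches = no_duplicate_batches(pairs, batch_size=2)
```

Here the second pair shares "a" with the first, so it is pushed into the next batch; every sample still appears exactly once overall.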

Training Logs

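| Epoch | Step | Training Loss | msmarco pairs loss | openbookqa pairs loss | scitail-pairs-pos loss | trivia pairs loss | gooaq pairs loss | compression-pairs loss | nli-pairs loss | vitaminc-pairs loss | scitail-pairs-qa loss | xsum-pairs loss | quora pairs loss | qasc pairs loss | qnli-contrastive loss | sciq pairs loss | nq pairs loss | sts-test_spearman_cosine |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.0250 | 147 | 16.851 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0501 | 294 | 11.2787 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0751 | 441 | 8.9166 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1001 | 588 | 7.9463 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1251 | 735 | 7.2108 | 8.7543 | 7.8627 | 3.9920 | 9.0511 | 7.5791 | 3.3038 | 6.9057 | 5.7209 | 3.7859 | 4.6004 | 4.5232 | 10.5803 | 8.1650 | 10.2145 | 8.4155 | 0.3721 |
| 0.1502 | 882 | 6.7709 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1752 | 1029 | 6.1746 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2002 | 1176 | 5.7706 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2252 | 1323 | 5.7283 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2503 | 1470 | 5.1856 | 4.8449 | 5.6524 | 2.4573 | 5.2907 | 3.9708 | 2.0630 | 4.3521 | 3.4988 | 1.3250 | 3.0712 | 1.5572 | 9.2457 | 12.9156 | 9.0681 | 5.0240 | 0.6368 |
| 0.2753 | 1617 | 4.185 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3003 | 1764 | 4.6367 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3253 | 1911 | 4.3615 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3504 | 2058 | 4.1791 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3754 | 2205 | 4.1051 | 3.9567 | 5.0910 | 1.8895 | 4.2864 | 3.2224 | 1.4911 | 3.2717 | 2.7198 | 0.8107 | 2.2627 | 1.1233 | 8.1039 | 9.6387 | 8.5974 | 4.0091 | 0.6653 |
| 0.4004 | 2352 | 3.7674 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4254 | 2499 | 3.8729 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4505 | 2646 | 3.4527 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4755 | 2793 | 3.3545 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5005 | 2940 | 3.3247 | 3.4786 | 4.6194 | 1.4237 | 3.5245 | 2.6586 | 1.1591 | 2.7122 | 2.2260 | 0.5898 | 1.8389 | 0.9096 | 7.8180 | 4.9263 | 8.2825 | 3.3450 | 0.6920 |
| 0.5255 | 3087 | 3.116 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5506 | 3234 | 3.2418 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5756 | 3381 | 3.0757 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6006 | 3528 | 2.8524 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6256 | 3675 | 2.6875 | 3.0210 | 4.2169 | 1.1910 | 3.1736 | 2.3525 | 0.8454 | 2.4791 | 1.9743 | 0.4400 | 1.4812 | 0.7636 | 6.9316 | 1.7706 | 8.0147 | 2.9561 | 0.7013 |
| 0.6507 | 3822 | 2.7808 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6757 | 3969 | 2.5687 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7007 | 4116 | 2.3034 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7257 | 4263 | 2.4412 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7508 | 4410 | 2.3293 | 2.7029 | 3.8574 | 1.0498 | 2.8798 | 2.0472 | 0.5027 | 2.3226 | 1.7957 | 0.3697 | 1.1691 | 0.6825 | 6.4047 | 1.0079 | 7.8237 | 2.6794 | 0.7122 |
| 0.7758 | 4557 | 2.3651 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8008 | 4704 | 2.6296 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8258 | 4851 | 2.2108 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8509 | 4998 | 2.1852 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8759 | 5145 | 2.2944 | 2.3863 | 3.7141 | 0.9187 | 2.4948 | 1.8280 | 0.4108 | 2.0635 | 1.6387 | 0.3160 | 1.0602 | 0.6137 | 6.3538 | 0.9640 | 7.5778 | 2.3543 | 0.7283 |
| 0.9009 | 5292 | 2.2133 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9259 | 5439 | 2.2255 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9510 | 5586 | 2.3502 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9760 | 5733 | 1.8964 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0010 | 5880 | 1.913 | 2.1638 | 3.4724 | 0.8628 | 2.3711 | 1.7041 | 0.3047 | 1.9677 | 1.4394 | 0.2668 | 0.9014 | 0.5216 | 5.9478 | 0.4572 | 1.0916 | 2.1109 | 0.7388 |
| 1.0260 | 6027 | 1.7772 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0511 | 6174 | 1.9079 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0761 | 6321 | 1.8657 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1011 | 6468 | 1.7144 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1261 | 6615 | 1.7661 | 2.0444 | 3.3518 | 0.7724 | 2.3691 | 1.5796 | 0.2659 | 1.7908 | 1.3404 | 0.2244 | 0.8371 | 0.4785 | 5.7539 | 0.2737 | 0.9384 | 1.9409 | 0.7446 |
| 1.1512 | 6762 | 1.8066 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.1762 | 6909 | 1.7438 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2012 | 7056 | 2.0231 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2263 | 7203 | 1.8966 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.2513 | 7350 | 1.7958 | 1.8952 | 3.1631 | 0.7215 | 1.9967 | 1.3951 | 0.2498 | 1.5906 | 1.2226 | 0.1778 | 0.7920 | 0.4054 | 5.4840 | 0.3951 | 0.8344 | 1.6935 | 0.7535 |
| 1.2763 | 7497 | 1.5109 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3013 | 7644 | 1.8119 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3264 | 7791 | 1.6833 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3514 | 7938 | 1.5917 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.3764 | 8085 | 1.809 | 1.7568 | 2.9011 | 0.6572 | 1.8419 | 1.1746 | 0.2301 | 1.5968 | 1.1435 | 0.1577 | 0.7029 | 0.3561 | 5.4334 | 0.3819 | 0.7997 | 1.5408 | 0.7594 |
| 1.4014 | 8232 | 1.5561 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4265 | 8379 | 1.5325 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4515 | 8526 | 1.5085 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.4765 | 8673 | 1.5634 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |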
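
The Epoch and Step columns in the logs follow a regular pattern: training loss is logged every 147 steps and the evaluation columns are filled every 735 steps. The sketch below reproduces the Epoch value from the Step value. The figure of 5874 steps per epoch is an assumption inferred from the logged values, not a number stated in this card.

```python
STEPS_PER_EPOCH = 5874  # ASSUMPTION: inferred so that step 5880 maps to epoch 1.0010
LOG_EVERY = 147         # spacing between consecutive logged steps
EVAL_EVERY = 735        # spacing between rows that carry evaluation losses


def epoch_at(step):
    """Fractional epoch for a step, rounded to four decimals as in the logs."""
    return round(step / STEPS_PER_EPOCH, 4)


def is_eval_step(step):
    """True when the row at this step is expected to carry evaluation columns."""
    return step % EVAL_EVERY == 0


last_logged_step = 8673
eval_steps = [
    step
    for step in range(LOG_EVERY, last_logged_step + 1, LOG_EVERY)
    if is_eval_step(step)
]
```

Under this assumption, `eval_steps` lists the eleven steps from 735 to 8085 at which the per-dataset losses and sts-test_spearman_cosine are reported.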

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.41.2
  • PyTorch: 2.1.2
  • Accelerate: 0.30.1
  • Datasets: 2.19.2
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

AdaptiveLayerLoss

@misc{li20242d,
    title={2D Matryoshka Sentence Embeddings}, 
    author={Xianming Li and Zongxi Li and Jing Li and Haoran Xie and Qing Li},
    year={2024},
    eprint={2402.14776},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning}, 
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}