astro-hep-bert / README.md
metadata
license: apache-2.0
language:
  - en
pipeline_tag: fill-mask
widget:
  - text: >-
      The Standard Model (SM) of [MASK] physics has been tested by many
      experiments over the last four decades and has been shown to successfully
      describe high energy particle interactions.
    example_title: particle physics
  - text: >-
      Clear evidence for the production of a neutral boson with a measured mass
      of [MASK].0 ± 0.4 (stat) ± 0.4 (sys) GeV is presented.
    example_title: 126.0 ± 0.4 (stat) ± 0.4 (sys) GeV
  - text: >-
      An excess of [MASK] is observed above the expected background, with a
      local significance of 5.0 standard deviations, at a mass near 125 GeV,
      signalling the production of a new particle.
    example_title: excess of events
  - text: >-
      On September 14, 2015 at 09:50:45 UTC the two [MASK] of the Laser
      Interferometer Gravitational-Wave Observatory simultaneously observed a
      transient gravitational-wave signal.
    example_title: two detectors
  - text: >-
      These first images from the EHT achieve the highest [MASK] resolution in
      the history of ground-based VLBI.
    example_title: angular resolution
  - text: >-
      We propose a comprehensive theory of [MASK] matter that explains the
      recent proliferation of unexpected observations in high-energy
      astrophysics.
    example_title: dark matter
  - text: >-
      Formation of galaxy clusters corresponds to the collapse of the largest
      gravitationally bound overdensities in the initial [MASK] field and is
      accompanied by the most energetic phenomena since the Big Bang and by the
      complex interplay between gravity-induced dynamics of collapse and
      baryonic processes associated with galaxy formation.
    example_title: initial density field
  - text: >-
      The Event [MASK] Telescope (EHT) has led to the first images of a
      supermassive black hole, revealing the central compact objects in the
      elliptical galaxy M87 and the Milky Way.
    example_title: Event Horizon Telescope
datasets:
  - wikipedia
  - bookcorpus
tags:
  - physics
  - astrophysics
  - high-energy physics (HEP)
  - history of science
  - philosophy of science
  - sociology of science
  - word embeddings
  - semantic shift detection
  - conceptual change
  - epistemic change
  - arXiv

Model Card for Astro-HEP-BERT

Astro-HEP-BERT is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). It starts from Google's bert-base-uncased and was further trained for three epochs on 21.84 million paragraphs drawn from more than 600,000 arXiv articles on astrophysics and/or HEP. The sole training objective was masked language modeling.
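Because the model was trained with masked language modeling, the quickest sanity check is to let it fill in a masked token. The following is a minimal sketch using the `transformers` fill-mask pipeline. The Hub id `arnosimons/astro-hep-bert` is an assumption inferred from the repository path, and `top_tokens` and `fill_mask_demo` are hypothetical helper names introduced only for this illustration.

```python
from typing import List, Sequence, Tuple

# Assumed Hugging Face Hub id, inferred from the repository path; verify before use.
MODEL_ID = "arnosimons/astro-hep-bert"


def top_tokens(predictions: Sequence[dict], k: int = 3) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (token, score) pairs from a fill-mask result."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    return [(p["token_str"].strip(), round(p["score"], 3)) for p in ranked[:k]]


def fill_mask_demo(text: str, model_id: str = MODEL_ID, k: int = 3) -> List[Tuple[str, float]]:
    """Predict the single [MASK] token in `text`.

    Downloads the model weights on first use, so this needs network access.
    """
    from transformers import pipeline  # lazy import: only needed for inference

    fill_mask = pipeline("fill-mask", model=model_id)
    return top_tokens(fill_mask(text), k=k)
```

Calling `fill_mask_demo("The Event [MASK] Telescope (EHT) has led to the first images of a supermassive black hole.")` should return the model's top candidates for the masked word together with their probabilities. The input must contain exactly one `[MASK]` for the result to have the flat list shape that `top_tokens` expects.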

The Astro-HEP-BERT project embodies the spirit of a tabletop experiment or grassroots scientific effort. Training used only open-source inputs, and all three epochs ran on a single MacBook Pro (M2, 96 GB) over 48 days. The project is a proof of concept: it shows that a bidirectional transformer can be trained for research in the history, philosophy, and sociology of science (HPSS) even with limited financial resources.
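To make the intended use concrete, the contextualized word embeddings the model is meant to supply can be extracted roughly as follows. This is a sketch under stated assumptions, not the project's own pipeline: it mean-pools the last hidden layer over a word's WordPiece tokens, the Hub id is inferred from the repository path, and `find_span`, `word_embedding`, and `compare_contexts` are hypothetical helper names.

```python
from typing import Sequence

# Assumed Hugging Face Hub id, inferred from the repository path; verify before use.
MODEL_ID = "arnosimons/astro-hep-bert"


def find_span(ids: Sequence[int], pieces: Sequence[int]) -> int:
    """Index of the first occurrence of `pieces` inside `ids`, or -1 if absent."""
    n = len(pieces)
    if n == 0:
        return -1
    for i in range(len(ids) - n + 1):
        if list(ids[i:i + n]) == list(pieces):
            return i
    return -1


def word_embedding(sentence: str, word: str, tokenizer, model):
    """Mean-pool the last-layer vectors of the word pieces that make up `word`.

    Pooling over the last layer is an illustrative choice, not the authors' method.
    """
    import torch

    enc = tokenizer(sentence, return_tensors="pt", truncation=True)
    pieces = tokenizer(word, add_special_tokens=False)["input_ids"]
    start = find_span(enc["input_ids"][0].tolist(), pieces)
    if start < 0:
        raise ValueError(f"{word!r} not found in the tokenized sentence")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # shape: (tokens, hidden_size)
    return hidden[start:start + len(pieces)].mean(dim=0)


def compare_contexts(word: str, sent_a: str, sent_b: str, model_id: str = MODEL_ID) -> float:
    """Cosine similarity between the embeddings of `word` in two sentences."""
    import torch
    from transformers import AutoModel, AutoTokenizer  # lazy import; needs network on first use

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id).eval()
    a = word_embedding(sent_a, word, tokenizer, model)
    b = word_embedding(sent_b, word, tokenizer, model)
    return torch.nn.functional.cosine_similarity(a, b, dim=0).item()
```

For example, `compare_contexts("horizon", "The event horizon of the black hole was imaged.", "New physics may lie beyond the current experimental horizon.")` returns a single similarity score. A lower score suggests the two occurrences are used in more distinct senses, which is the kind of signal semantic shift detection builds on.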

For further details on the model, the corpus, and the underlying research project (Network Epistemology in Practice), please refer to the Astro-HEP-BERT paper [link coming soon].

Model Details