---
license: apache-2.0
language:
- en
pipeline_tag: fill-mask
widget:
  - text: >-
      The Standard Model (SM) of [MASK] physics has been tested by many
      experiments over the last four decades and has been shown to successfully
      describe high energy particle interactions.
    example_title: particle physics
  - text: >-
      Clear evidence for the production of a neutral boson with a measured mass of
      [MASK].0 ± 0.4 (stat) ± 0.4 (sys) GeV is presented.
    example_title: 126.0 ± 0.4 (stat) ± 0.4 (sys) GeV
  - text: >-
      An excess of [MASK] is observed above the expected background, with a local
      significance of 5.0 standard deviations, at a mass near 125 GeV, signalling
      the production of a new particle.
    example_title: excess of events
  - text: >-
      On September 14, 2015 at 09:50:45 UTC the two [MASK] of the Laser
      Interferometer Gravitational-Wave Observatory simultaneously observed a
      transient gravitational-wave signal.
    example_title: two detectors
  - text: >-
      These first images from the EHT achieve the highest [MASK] resolution in the
      history of ground-based VLBI.
    example_title: angular resolution
  - text: >-
      We propose a comprehensive theory of [MASK] matter that explains the recent
      proliferation of unexpected observations in high-energy astrophysics.
    example_title: dark matter
  - text: >-
      Formation of galaxy clusters corresponds to the collapse of the largest
      gravitationally bound overdensities in the initial [MASK] field and is
      accompanied by the most energetic phenomena since the Big Bang and by the
      complex interplay between gravity-induced dynamics of collapse and baryonic
      processes associated with galaxy formation.
    example_title: initial density field
  - text: >-
      The Event [MASK] Telescope (EHT) has led to the first images of a
      supermassive black hole, revealing the central compact objects in the
      elliptical galaxy M87 and the Milky Way.
    example_title: Event Horizon Telescope
datasets:
- wikipedia
- bookcorpus
- arnosimons/astro-hep-corpus
tags:
- arXiv
- astrophysics
- conceptual analysis
- epistemic change
- high-energy physics (HEP)
- history of science
- semantic shift detection
- sociology of science
- philosophy of science
- physics
- word embeddings
---
# Model Card for Astro-HEP-BERT
**Astro-HEP-BERT** is a bidirectional transformer designed primarily to generate contextualized word embeddings for computational conceptual analysis in astrophysics and high-energy physics (HEP). Built on Google's `bert-base-uncased`, the model was further trained for three epochs on the <a target="_blank" rel="noopener noreferrer" href="https://huggingface.co/datasets/arnosimons/astro-hep-corpus">Astro-HEP Corpus</a>, which contains 21.84 million paragraphs from more than 600,000 scholarly arXiv articles on astrophysics and/or HEP. The sole training objective was masked language modeling.

The Astro-HEP-BERT project demonstrates that training a customized bidirectional transformer for computational conceptual analysis in the history, philosophy, and sociology of science is feasible as an open-source endeavor without a substantial budget. Using only freely available code, weights, and text inputs, the entire training process was conducted on a single MacBook Pro laptop (M2/96GB).

For further insights into the model, the corpus, and the underlying research project (<a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932">Network Epistemology in Practice</a>), please refer to the following two papers:
1) <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2411.14877">Simons, A. (2024). Astro-HEP-BERT: A bidirectional language model for studying the meanings of concepts in astrophysics and high energy physics. arXiv:2411.14877.</a>
2) <a target="_blank" rel="noopener noreferrer" href="https://arxiv.org/abs/2411.14073">Simons, A. (2024). Meaning at the Planck scale? Contextualized word embeddings for doing history, philosophy, and sociology of science. arXiv:2411.14073.</a>
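
The contextualized word embeddings described above can be extracted roughly as follows. This is a minimal sketch, not the author's published workflow: the repository ID `arnosimons/astro-hep-bert` is an assumption inferred from the corpus namespace, and the helper names `word_embedding` and `cosine_similarity` are hypothetical. The sketch averages the last-layer vectors of the word pieces that make up a target word, which is one common pooling choice among several.

```python
import math
from typing import List, Sequence

# Assumed repository ID (not stated in this card); adjust if it differs.
MODEL_ID = "arnosimons/astro-hep-bert"


def word_embedding(sentence: str, word: str, model_id: str = MODEL_ID) -> List[float]:
    """Return the mean last-layer vector of the word pieces spelling `word`."""
    # Heavy dependencies are imported lazily so that cosine_similarity below
    # can be used without them.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # For brevity the model is reloaded on every call; cache it in real use.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    encoded = tokenizer(sentence, return_tensors="pt")
    input_ids = encoded["input_ids"][0].tolist()
    target_ids = tokenizer(word, add_special_tokens=False)["input_ids"]

    # Find the first position where the word's token IDs occur in the sentence.
    span = len(target_ids)
    start = -1
    for i in range(len(input_ids) - span + 1):
        if input_ids[i:i + span] == target_ids:
            start = i
            break
    if start < 0:
        raise ValueError(f"{word!r} not found in sentence")

    with torch.no_grad():
        hidden = model(**encoded).last_hidden_state[0]  # (tokens, hidden_size)
    return hidden[start:start + span].mean(dim=0).tolist()


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Comparing `word_embedding(s1, "horizon")` with `word_embedding(s2, "horizon")` via `cosine_similarity` then gives a rough measure of how similarly the term is used in the two sentences. The first call downloads the weights and requires `torch` and `transformers` to be installed.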
## Model Details
- **Developer:** <a target="_blank" rel="noopener noreferrer" href="https://www.tu.berlin/en/hps-mod-sci/arno-simons">Arno Simons</a>
- **Funded by:** The European Union under grant agreement ID <a target="_blank" rel="noopener noreferrer" href="https://doi.org/10.3030/101044932">101044932</a>
- **Language (NLP):** English
- **License:** apache-2.0
- **Parent model:** Google's <a target="_blank" rel="noopener noreferrer" href="https://github.com/google-research/bert">`bert-base-uncased`</a>
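
Because the model was trained with masked language modeling and is tagged `fill-mask`, it can be queried for the most likely fillers of a `[MASK]` slot. The following is a minimal sketch using the `transformers` fill-mask pipeline. The repository ID `arnosimons/astro-hep-bert` is an assumption inferred from the corpus namespace, and `mask_word` and `top_predictions` are hypothetical helper names introduced for illustration.

```python
from typing import List, Tuple

# Assumed repository ID (not stated in this card); adjust if it differs.
MODEL_ID = "arnosimons/astro-hep-bert"


def mask_word(sentence: str, word: str, mask_token: str = "[MASK]") -> str:
    """Replace the first occurrence of `word` in `sentence` with the mask token."""
    if word not in sentence:
        raise ValueError(f"{word!r} not found in sentence")
    return sentence.replace(word, mask_token, 1)


def top_predictions(
    prompt: str, model_id: str = MODEL_ID, top_k: int = 5
) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for the masked slot in `prompt`."""
    # Imported lazily so that mask_word works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    return [(r["token_str"], r["score"]) for r in fill_mask(prompt, top_k=top_k)]
```

For example, `top_predictions(mask_word("We propose a comprehensive theory of dark matter.", "dark"))` returns the five highest-scoring candidates for the masked position. The first call downloads the model weights.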
<!---
## How to Get Started with the Model
Use the code below to get started with the model.
[Coming soon]
## Citation
**BibTeX:**
[Coming soon]
**APA:**
[Coming soon]
-->