nlpaueb committed
Commit 0077cd2
1 Parent(s): c6b4b0d

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -13,7 +13,7 @@ widget:
 
 <img align="left" src="https://i.ibb.co/p3kQ7Rw/Screenshot-2020-10-06-at-12-16-36-PM.png" width="100"/>
 
-LEGAL-BERT is a family of BERT models for the legal domain, intended to assist legal NLP research, computational law, and legal technology applications. To pre-train the different variations of LEGAL-BERT, we collected 12 GB of diverse English legal text from several fields (e.g., legislation, court cases, contracts) scraped from publicly available resources. Sub-domains variants (CONTRACTS-, EURLEX-, ECHR-) and/or general LEGAL-BERT perform better than using BERT out of the box for domain-specific tasks. A light-weight model (33% the size of BERT-BASE) pre-trained from scratch on legal data with competitive perfomance is also available.
+LEGAL-BERT is a family of BERT models for the legal domain, intended to assist legal NLP research, computational law, and legal technology applications. To pre-train the different variations of LEGAL-BERT, we collected 12 GB of diverse English legal text from several fields (e.g., legislation, court cases, contracts) scraped from publicly available resources. Sub-domain variants (CONTRACTS-, EURLEX-, ECHR-) and/or general LEGAL-BERT perform better than using BERT out of the box for domain-specific tasks. A light-weight model (33% the size of BERT-BASE) pre-trained from scratch on legal data with competitive performance is also available.
 <br/><br/><br/><br/>
 
 ---
@@ -40,7 +40,7 @@ The pre-training corpora of LEGAL-BERT include:
 
 ## Pre-training details
 
-* We trained BERT using the official code provided in Google BERT's github repository (https://github.com/google-research/bert).
+* We trained BERT using the official code provided in Google BERT's GitHub repository (https://github.com/google-research/bert).
 * We released a model similar to the English BERT-BASE model (12-layer, 768-hidden, 12-heads, 110M parameters).
 * We chose to follow the same training set-up: 1 million training steps with batches of 256 sequences of length 512 with an initial learning rate 1e-4.
 * We were able to use a single Google Cloud TPU v3-8 provided for free from [TensorFlow Research Cloud (TFRC)](https://www.tensorflow.org/tfrc), while also utilizing [GCP research credits](https://edu.google.com/programs/credits/research). Huge thanks to both Google programs for supporting us!
@@ -124,6 +124,6 @@ Consider the experiments in the article "LEGAL-BERT: The Muppets straight out of
 }
 ```
 
-Ilias Chalkidis on behalf of [AUEB's Natural Language Processing Group](http://nlp.cs.aueb.gr)
+[Ilias Chalkidis](https://iliaschalkidis.github.io) on behalf of [AUEB's Natural Language Processing Group](http://nlp.cs.aueb.gr)
 
-| Github: [@ilias.chalkidis](https://github.com/seolhokim) | Twitter: [@KiddoThe2B](https://twitter.com/KiddoThe2B) |
+| Github: [@ilias.chalkidis](https://github.com/iliaschalkidis) | Twitter: [@KiddoThe2B](https://twitter.com/KiddoThe2B) |
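
For readers landing on this commit, a minimal sketch of how a checkpoint from the model family described in this README can be loaded with the Hugging Face `transformers` library. The model id used below (`nlpaueb/legal-bert-base-uncased`) and the example sentence are illustrative assumptions, not taken from this diff.

```python
# Minimal usage sketch (assumed model id, not part of this commit):
# load a LEGAL-BERT checkpoint and produce contextual embeddings.
from transformers import AutoTokenizer, AutoModel

model_name = "nlpaueb/legal-bert-base-uncased"  # assumed checkpoint id on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode a short legal-style sentence and run it through the encoder.
inputs = tokenizer(
    "The applicant submitted that her husband was subjected to treatment amounting to abuse.",
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, 768)
```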