---
library_name: transformers
license: apache-2.0
language:
- en
datasets:
- howey/unarXive
- howey/wiki_en
- howey/hupd
---

## Using HDT
To use the pre-trained model for masked language modeling, use the following snippet:
```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# See the HDT collection page on the Hub for the list of available models.
tokenizer = AutoTokenizer.from_pretrained('google-bert/bert-base-uncased')
model_name = 'howey/HDT-E'
model = AutoModelForMaskedLM.from_pretrained(model_name)
```
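As a minimal sketch of what to do with the loaded model (assuming the checkpoint follows the standard Hugging Face masked-LM forward interface, with the `[MASK]` token supplied by the BERT tokenizer above), you can fill in a masked token like this:

```python
import torch

text = "Paris is the [MASK] of France."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and take the highest-scoring token there.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_id = logits[0, mask_pos].argmax(dim=-1)
print(tokenizer.decode(predicted_id))
```

The same call pattern applies to long documents, HDT's intended use case, for inputs up to the 8192-token context length.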

For more details, please see our GitHub repository: [HDT](https://github.com/autonomousvision/hdt).

## Model Details
The model has a context length of `8192` tokens and is comparable in size to BERT, with approximately `110M` parameters. It was trained with the standard masked language modeling objective, using a Transformer-based architecture with our proposed hierarchical attention. Training ran for 24 hours on the ArXiv+Wikipedia+HUPD corpus, processing a total of `160 million` tokens.

For more details, please see our paper: [HDT: Hierarchical Document Transformer](https://arxiv.org/pdf/2407.08330).
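As a quick, hedged sanity check of the reported figures: counting parameters works for any PyTorch model, while the config field that stores the 8192-token context length depends on the architecture, so the sketch below simply prints the config rather than assuming a field name.

```python
from transformers import AutoModelForMaskedLM

model = AutoModelForMaskedLM.from_pretrained('howey/HDT-E')

# Count parameters; this should come out near the reported ~110M.
n_params = sum(p.numel() for p in model.parameters())
print(f"parameters: {n_params / 1e6:.0f}M")

# Inspect the config for the context length and attention settings.
print(model.config)
```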

## Citation

Please cite our work using the BibTeX entry below:

```bibtex
@inproceedings{He2024COLM,
  title={HDT: Hierarchical Document Transformer},
  author={Haoyu He and Markus Flicke and Jan Buchmann and Iryna Gurevych and Andreas Geiger},
  year={2024},
  booktitle={Conference on Language Modeling}
}
```

## Model Card Contact
Haoyu He (haoyu.he@uni-tuebingen.de)