---
library_name: transformers
license: apache-2.0
language:
- en
datasets:
- howey/unarXive
- howey/wiki_en
- howey/hupd
---

# Model Weights Coming Soon!

## Using HDT

To use the pre-trained model for masked language modeling, use the following snippet:

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# See the HDT collection page on the Hub for the list of available models.
model_name = 'howey/HDT-E'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)
```
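
Once loaded, masked-token prediction follows the standard `transformers` workflow. The snippet below is a minimal sketch, assuming the tokenizer exposes a standard mask token; the example sentence is purely illustrative.

```python
import torch

# Minimal masked-token prediction sketch (illustrative only).
text = f"Scientific documents are organized into sections, paragraphs and {tokenizer.mask_token}."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and decode the highest-scoring prediction.
mask_positions = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
predicted_ids = logits[0, mask_positions].argmax(dim=-1)
print(tokenizer.decode(predicted_ids))
```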

For more details, please see our GitHub repository: [HDT](https://github.com/autonomousvision/hdt).

## Model Details

The model has a context length of `8192` tokens and is similar in size to BERT, with approximately `110M` parameters. It was trained on the standard masked language modeling task using a Transformer-based architecture with our proposed hierarchical attention. Training ran for 24 hours on the ArXiv+Wikipedia+HUPD corpus, processing a total of `1.3 billion` tokens.
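
As a quick sanity check of these numbers after loading the model, one can print the parameter count and the configured maximum sequence length. This is a minimal sketch; the `max_position_embeddings` attribute name is an assumption based on common `transformers` config conventions and may differ for HDT.

```python
# Rough sanity check (assumes `model` from the loading snippet above).
num_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {num_params / 1e6:.0f}M")

# `max_position_embeddings` is the conventional config field for context length;
# the exact attribute name for HDT may differ.
print(f"Context length: {getattr(model.config, 'max_position_embeddings', 'n/a')}")
```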

For more details, please see our paper: [HDT: Hierarchical Document Transformer](https://arxiv.org/pdf/2407.08330).

## Citation

Please cite our work using the BibTeX entry below:

**BibTeX:**

```bibtex
@inproceedings{He2024COLM,
  title     = {HDT: Hierarchical Document Transformer},
  author    = {Haoyu He and Markus Flicke and Jan Buchmann and Iryna Gurevych and Andreas Geiger},
  year      = {2024},
  booktitle = {Conference on Language Modeling}
}
```

## Model Card Contact

Haoyu (haoyu.he@uni-tuebingen.de) |