Hebrew Language Model
State-of-the-art RoBERTa language model for Hebrew.
How to use
from transformers import AutoModelForMaskedLM, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('HeNLP/HeRo')
model = AutoModelForMaskedLM.from_pretrained('HeNLP/HeRo')
# Tokenization example
# Tokenize a Hebrew string ("Hello everyone")
tokenized_string = tokenizer('שלום לכולם')
# Decode the token IDs back to text, dropping special tokens
decoded_string = tokenizer.decode(tokenized_string['input_ids'], skip_special_tokens=True)
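Since the model is loaded with a masked-language-modeling head, one natural way to try it is mask filling. The following is a minimal sketch, not part of the original model card: it assumes the standard `fill-mask` pipeline from `transformers` works with this checkpoint, and the helper names `load_fill_mask` and `top_predictions` are hypothetical, introduced only for illustration.

```python
from typing import Any, Callable, Dict, List, Tuple


def load_fill_mask(model_name: str = "HeNLP/HeRo"):
    """Build a fill-mask pipeline (downloads the model on first call)."""
    # Imported lazily so top_predictions can be used without loading the model.
    from transformers import pipeline

    return pipeline("fill-mask", model=model_name)


def top_predictions(
    fill_mask: Callable[..., List[Dict[str, Any]]],
    text: str,
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Return (token, score) pairs for a text containing exactly one mask token."""
    # For a single mask, the pipeline returns a list of dicts that include
    # the keys "token_str" and "score".
    results = fill_mask(text, top_k=k)
    return [(r["token_str"].strip(), float(r["score"])) for r in results]
```

To use it, call `fill_mask = load_fill_mask()` and then `top_predictions(fill_mask, 'שלום ' + fill_mask.tokenizer.mask_token)`. Reading the mask token from the tokenizer avoids hard-coding it.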
Citing
If you use HeRo in your research, please cite HeRo: RoBERTa and Longformer Hebrew Language Models.
@article{shalumov2023hero,
title={HeRo: RoBERTa and Longformer Hebrew Language Models},
author={Vitaly Shalumov and Harel Haskey},
year={2023},
journal={arXiv:2304.11077},
}