---
license: cc-by-4.0
language:
- he
inference: false
---

# **DictaLM-rab**: A Large Generative Language Model for Rabbinic Hebrew

A large generative pretrained transformer (GPT) language model for Hebrew, released [here](https://arxiv.org/abs/2309.14568).

- This is an alpha version of the model, and there are many improvements to come.
- We are actively working on improving the model, so stay tuned.

This is the base model, pretrained on general text completion. On its own it isn't very useful, but it can be fine-tuned for specific tasks (instruct, chat, QA, and more).

This model differs from the regular [DictaLM](https://huggingface.co/dicta-il/dictalm-7b/) in the training data used for pretraining. The regular `DictaLM` was pretrained on modern texts only, while this model (`DictaLM-Rab`) was pretrained on a mixture of 50% modern texts and 50% rabbinic/historical texts.

## Sample usage (for text completion):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm-rab-7b')
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-rab-7b', trust_remote_code=True).cuda()
model.eval()

with torch.inference_mode():
    prompt = 'אמר רב יהודה אמר שמואל הכותב'
    kwargs = dict(
        inputs=tokenizer(prompt, return_tensors='pt').input_ids.to(model.device),
        do_sample=True,
        top_k=50,
        top_p=0.95,
        temperature=0.75,
        max_length=100,
        min_new_tokens=5
    )
    print(tokenizer.batch_decode(model.generate(**kwargs), skip_special_tokens=True))
```

There are many different parameters you can pass in `kwargs` for different results (greedy decoding, beam search, different sampling configurations, longer/shorter responses, etc.). You can view the full list of parameters you can pass to the `generate` function [here](https://huggingface.co/docs/transformers/v4.33.0/en/main_classes/text_generation#transformers.GenerationMixin.generate).
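For example, here is a minimal sketch (not part of the official usage above, and assuming the same `tokenizer`, `model`, and `prompt` as in the previous snippet) of swapping the sampling configuration for greedy decoding or beam search:

```python
# Reuse `tokenizer`, `model`, and `prompt` from the snippet above.
input_ids = tokenizer(prompt, return_tensors='pt').input_ids.to(model.device)

with torch.inference_mode():
    # Greedy decoding: deterministic, picks the highest-probability token at each step.
    greedy = model.generate(input_ids, do_sample=False, max_new_tokens=50)

    # Beam search: keeps the 5 best partial continuations and returns the highest-scoring one.
    beams = model.generate(input_ids, do_sample=False, num_beams=5, early_stopping=True, max_new_tokens=50)

print(tokenizer.batch_decode(greedy, skip_special_tokens=True))
print(tokenizer.batch_decode(beams, skip_special_tokens=True))
```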
### Alternative ways to initialize the model:

If you have multiple smaller GPUs, and the package `accelerate` is installed, you can initialize the model split across the devices:
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-rab-7b', trust_remote_code=True, device_map='auto')
```

If you are running on Linux and have the `bitsandbytes` package installed, you can initialize the model in 4-/8-bit inference mode:
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-rab-7b', trust_remote_code=True, load_in_8bit=True)
```

If you have [FlashAttention](https://github.com/Dao-AILab/flash-attention) installed in your environment, you can instruct the model to use the flash-attention implementation (either V1 or V2, whichever is installed):
```python
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-rab-7b', trust_remote_code=True, use_flash_attention=True)
```

## Citation

If you use DictaLM in your research, please cite ```DictaLM -- A Large Generative Language Model for Modern Hebrew```

**BibTeX:**

```bibtex
@misc{shmidman2023introducing,
      title={Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew},
      author={Shaltiel Shmidman and Avi Shmidman and Amir David Nissan Cohen and Moshe Koppel},
      year={2023},
      eprint={2309.14568},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

## License

Shield: [![CC BY 4.0][cc-by-shield]][cc-by]

This work is licensed under a
[Creative Commons Attribution 4.0 International License][cc-by].

[![CC BY 4.0][cc-by-image]][cc-by]

[cc-by]: http://creativecommons.org/licenses/by/4.0/
[cc-by-image]: https://i.creativecommons.org/l/by/4.0/88x31.png
[cc-by-shield]: https://img.shields.io/badge/License-CC%20BY%204.0-lightgrey.svg