DictaLM: A Large Generative Language Model for Modern Hebrew

A large generative pretrained transformer (GPT) language model for Hebrew, introduced in the paper cited below (arXiv:2309.14568).

  • This is an alpha version of the model, and there are many improvements to come.
  • We are actively working on improving the model, so stay tuned.

This model was fine-tuned to follow instructions. Here are a few examples of the types of instructions it was trained on:

  • General questions:

    מה זה בית ספר?
    (English: What is a school?)
    
    קיבלתי חתך קל באצבע. מהי הדרך הנכונה לטפל בזה?
    (English: I got a minor cut on my finger. What is the right way to treat it?)
    
  • Simple tasks:

    תציע כמה רעיונות לפעילות עם ילדים בני 5:
    (English: Suggest a few ideas for an activity with 5-year-old children:)
    
  • Information retrieval from a paragraph context:

        המסיק הידני הוא הדרך המסורתית והעתיקה לקטיף זיתים. שיטה זו דורשת כוח אדם רב באופן יחסי ועדיין מקובלת בישראל ובמקומות רבים בעולם. שיטות מסיק ידני מאפשרות חיסכון עלויות במקומות בהם כוח האדם זול ועלות השיטות הממוכנות גבוהה. לזיתים המיועדים למאכל (לכבישה, בניגוד לזיתים לשמן) מתאים יותר מסיק ידני כיוון שהפרי פחות נפגע במהלך המסיק בשיטה זו (פגיעות בקליפת הפרי בזיתים לשמן פחות משמעותיות). כמו כן מועדף מסיק ידני באזורים בהם הטופוגרפיה המקומית או צפיפות העצים לא מאפשרים גישה נוחה לכלים מכניים. השיטה הידנית מאפשרת גם למסוק עצים שונים במועדים שונים, בהתאם לקצב הבשלת הפרי הטבעי בכל עץ.
        
        על בסיס הפסקה הזאת, מה הוא היתרון של מסיק ידני מבחינת קצב הבשלת הפרי?

        (English: Manual harvesting is the traditional, ancient way of picking olives. This method requires a relatively large workforce and is still common in Israel and in many places around the world. Manual harvesting saves costs where labor is cheap and mechanized methods are expensive. For table olives (for pickling, as opposed to olives for oil), manual harvesting is more suitable because the fruit is damaged less during harvesting (damage to the skin matters less for oil olives). Manual harvesting is also preferred where the local topography or the density of the trees does not allow convenient access for mechanical tools. The manual method also makes it possible to harvest different trees at different times, according to the natural ripening rate of the fruit on each tree.

        Based on this paragraph, what is the advantage of manual harvesting in terms of the fruit's ripening rate?)
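
The context-based example above pairs a paragraph with a question about it. Below is a minimal sketch of assembling such a prompt. The helper name `build_context_prompt` is hypothetical, and the layout (paragraph, blank line, question, trailing newline) is only inferred from the examples in this card; it is not a documented prompt template.

```python
def build_context_prompt(paragraph: str, question: str) -> str:
    """Join a context paragraph and a question into one prompt string.

    Assumed layout (inferred from the examples in this card, not documented):
    paragraph, blank line, question, trailing newline.
    """
    return f"{paragraph.strip()}\n\n{question.strip()}\n"


# Last sentence of the sample paragraph, followed by the sample question.
paragraph = "השיטה הידנית מאפשרת גם למסוק עצים שונים במועדים שונים, בהתאם לקצב הבשלת הפרי הטבעי בכל עץ."
question = "על בסיס הפסקה הזאת, מה הוא היתרון של מסיק ידני מבחינת קצב הבשלת הפרי?"

prompt = build_context_prompt(paragraph, question)
print(prompt)
```

The resulting string can be passed to the tokenizer in the same way as the `prompt` variable in the sample usage.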
    

Sample usage:

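from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# trust_remote_code=True lets transformers run the custom model code
# shipped in the model repository.
tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm-7b-instruct')
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True).cuda()

model.eval()  # inference only: disables dropout

with torch.inference_mode():
    # "Suggest a few ideas for an activity with 5-year-old children:"
    prompt = 'תציע כמה רעיונות לפעילות עם ילדים בני 5:\n'
    kwargs = dict(
        inputs=tokenizer(prompt, return_tensors='pt').input_ids.to(model.device),
        do_sample=True,      # sample instead of greedy decoding
        top_k=50,            # consider only the 50 most likely tokens
        top_p=0.95,          # nucleus sampling threshold
        temperature=0.75,    # values below 1.0 make sampling less random
        max_length=100,      # total length in tokens, prompt included
        min_new_tokens=5
    )

    print(tokenizer.batch_decode(model.generate(**kwargs), skip_special_tokens=True))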

You can pass many other parameters in kwargs to get different behavior (greedy decoding, beam search, other sampling configurations, longer or shorter responses, etc.).

The full list of parameters accepted by the generate function is in the Hugging Face transformers documentation on text generation.
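
As an illustration of the decoding options mentioned above, here is a minimal sketch of three presets. The helper name `generation_kwargs` and the specific values for beam search are illustrative assumptions, not recommendations from the model authors. The parameter names themselves (`do_sample`, `num_beams`, `early_stopping`, `max_new_tokens`, `top_k`, `top_p`, `temperature`) are standard arguments of `generate` in transformers.

```python
def generation_kwargs(strategy: str, max_new_tokens: int = 100) -> dict:
    """Return keyword arguments for `model.generate` for a named strategy."""
    # max_new_tokens counts only generated tokens, unlike max_length,
    # which also counts the prompt.
    common = dict(max_new_tokens=max_new_tokens)
    if strategy == "greedy":
        # Always pick the most likely next token (deterministic).
        return dict(common, do_sample=False, num_beams=1)
    if strategy == "beam":
        # Track several candidate continuations; num_beams=4 is an arbitrary choice.
        return dict(common, do_sample=False, num_beams=4, early_stopping=True)
    if strategy == "sample":
        # Same sampling settings as the sample usage in this card.
        return dict(common, do_sample=True, top_k=50, top_p=0.95, temperature=0.75)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Usage (assumes `model`, `tokenizer`, and `prompt` from the sample usage):
#   input_ids = tokenizer(prompt, return_tensors='pt').input_ids.to(model.device)
#   output = model.generate(inputs=input_ids, **generation_kwargs("beam"))
print(generation_kwargs("greedy"))
```

Keeping the presets in one function makes it easy to compare strategies on the same prompt without editing the generation call itself.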

Alternative ways to initialize the model:

If you have several smaller GPUs and the accelerate package is installed, you can load the model split across the devices:

model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, device_map='auto')

If you are running on Linux and have the bitsandbytes package installed, you can initialize the model in 4-bit or 8-bit inference mode (8-bit is shown here):

model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, load_in_8bit=True)
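
To see why reduced precision helps on smaller GPUs, the following back-of-the-envelope sketch estimates the memory taken by the weights alone. It assumes exactly 7 billion parameters, a round figure suggested by the "7b" in the model name rather than a number stated in this card. It also ignores activations, the key/value cache, and quantization overhead, so real usage will be higher.

```python
# Assumption: a round 7e9 parameters, suggested by "7b" in the model name.
NUM_PARAMS = 7_000_000_000

# Storage cost per parameter for each precision, in bytes.
BYTES_PER_PARAM = {"float32": 4.0, "float16": 2.0, "int8": 1.0, "4-bit": 0.5}


def weight_memory_gib(num_params: int, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bytes_per_param / 2**30


for name, size in BYTES_PER_PARAM.items():
    print(f"{name:>8}: {weight_memory_gib(NUM_PARAMS, size):.1f} GiB")
```

Under these assumptions the weights take about 13 GiB at 16 bits and about 6.5 GiB at 8 bits, which is the difference between needing a large GPU and fitting on a mid-range one.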

If you have FlashAttention installed in your environment, you can instruct the model to use the flash attention implementation (either V1 or V2, whichever is installed):

model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm-7b-instruct', trust_remote_code=True, use_flash_attention=True)
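
The loading options above depend on which optional packages are present. The sketch below picks keyword arguments based on what can be imported. The helper names `is_installed` and `choose_load_kwargs` are hypothetical, `flash_attn` is assumed to be the import name of the FlashAttention package, and this card does not state whether the options can be combined, so treat the combination logic as an assumption to verify.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


def choose_load_kwargs(want_8bit: bool = False, installed=is_installed) -> dict:
    """Build keyword arguments for `AutoModelForCausalLM.from_pretrained`."""
    kwargs = {"trust_remote_code": True}
    if want_8bit and installed("bitsandbytes") and installed("accelerate"):
        # 8-bit loading through bitsandbytes (Linux, per this card).
        kwargs["load_in_8bit"] = True
    elif installed("accelerate"):
        # Let accelerate split the model across the available devices.
        kwargs["device_map"] = "auto"
    if installed("flash_attn"):
        # Assumed import name for FlashAttention; the flag is from this card.
        kwargs["use_flash_attention"] = True
    return kwargs


# Usage (assumes transformers is installed):
#   model = AutoModelForCausalLM.from_pretrained(
#       'dicta-il/dictalm-7b-instruct', **choose_load_kwargs())
print(choose_load_kwargs())
```

Passing the `installed` check as a parameter keeps the selection logic testable without installing or uninstalling any package.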

Colab notebook demos

You can try the model on the free tier of Google Colab using the following notebooks:

Citation

If you use DictaLM in your research, please cite "Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew".

BibTeX:

@misc{shmidman2023introducing,
      title={Introducing DictaLM -- A Large Generative Language Model for Modern Hebrew}, 
      author={Shaltiel Shmidman and Avi Shmidman and Amir David Nissan Cohen and Moshe Koppel},
      year={2023},
      eprint={2309.14568},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

License

This work is licensed under a Creative Commons Attribution 4.0 International License.
