---
license: apache-2.0
pipeline_tag: text-generation
language:
- en
- he
tags:
- pretrained
inference:
  parameters:
    temperature: 0.7
---

[<img src="dicta-logo.jpg" width="300px"/>](https://dicta.org.il)

# Model Card for DictaLM-2.0

The DictaLM-2.0 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters, specializing in Hebrew.

For full details of this model, please read our [release blog post](https://example.com).

## Example Code

```python
from transformers import pipeline
import torch

# This loads the model onto the GPU in bfloat16 precision
model = pipeline('text-generation', 'dicta-il/dictalm2.0', torch_dtype=torch.bfloat16, device_map='cuda')

# Sample few-shot examples: each pair maps a past-tense verb (עבר) to its future-tense form (עתיד)
prompt = """
עבר: הלכתי
עתיד: אלך

עבר: שמרתי
עתיד: אשמור

עבר: שמעתי
עתיד: אשמע

עבר: הבנתי
עתיד:
"""

print(model(prompt.strip(), do_sample=False, max_new_tokens=8, stop_sequence='\n'))
# [{'generated_text': 'עבר: הלכתי\nעתיד: אלך\n\nעבר: שמרתי\nעתיד: אשמור\n\nעבר: שמעתי\nעתיד: אשמע\n\nעבר: הבנתי\nעתיד: אבין\n\n'}]
```

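Because the model is a base model, the few-shot layout of the prompt carries the whole task. The helper below is a minimal sketch of how such a prompt could be assembled from example pairs. The function names `build_few_shot_prompt` and `extract_completion` are hypothetical and not part of any library; the labels default to the ones used in the example above.

```python
def build_few_shot_prompt(examples, query, src_label="עבר", tgt_label="עתיד"):
    """Format (source, target) pairs plus an open query as a few-shot prompt."""
    blocks = [f"{src_label}: {src}\n{tgt_label}: {tgt}" for src, tgt in examples]
    # The final block leaves the target empty so the model completes it.
    blocks.append(f"{src_label}: {query}\n{tgt_label}:")
    # Blocks are separated by one blank line, matching the hand-written prompt.
    return "\n\n".join(blocks)


def extract_completion(generated_text, prompt):
    """Drop the prompt prefix (if present) and trim surrounding whitespace."""
    if generated_text.startswith(prompt):
        generated_text = generated_text[len(prompt):]
    return generated_text.strip()


examples = [("הלכתי", "אלך"), ("שמרתי", "אשמור"), ("שמעתי", "אשמע")]
prompt = build_few_shot_prompt(examples, "הבנתי")
```

The resulting `prompt` matches the stripped hand-written string, so it can be passed to the pipeline call in its place. `extract_completion` is intended for the `generated_text` field of the pipeline output, which echoes the prompt before the new tokens.
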
## Example Code - 4-Bit

Pre-quantized 4-bit models using the `AWQ` and `GPTQ` methods are already available: [DictaLM-2.0-AWQ](https://huggingface.co/dicta-il/dictalm2.0-AWQ) and [DictaLM-2.0-GPTQ](https://huggingface.co/dicta-il/dictalm2.0-GPTQ).

For dynamic quantization on the go, here is sample code that loads the model onto the GPU in 4-bit precision using the `bitsandbytes` package:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model onto the GPU, quantizing the weights to 4-bit on the fly (requires bitsandbytes)
model = AutoModelForCausalLM.from_pretrained('dicta-il/dictalm2.0', torch_dtype=torch.bfloat16, device_map='cuda', load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained('dicta-il/dictalm2.0')

# Same few-shot prompt as above: past tense (עבר) to future tense (עתיד)
prompt = """
עבר: הלכתי
עתיד: אלך

עבר: שמרתי
עתיד: אשמור

עבר: שמעתי
עתיד: אשמע

עבר: הבנתי
עתיד:
"""

# Tokenize the prompt, move it to the model's device, and generate greedily
encoded = tokenizer(prompt.strip(), return_tensors='pt').to(model.device)
print(tokenizer.batch_decode(model.generate(**encoded, do_sample=False, max_new_tokens=4)))
# ['<s> עבר: הלכתי\nעתיד: אלך\n\nעבר: שמרתי\nעתיד: אשמור\n\nעבר: שמעתי\nעתיד: אשמע\n\nעבר: הבנתי\nעתיד: אבין\n\n']
```

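To give a rough sense of why 4-bit loading helps, the arithmetic below estimates the memory taken by the weights alone. This is a back-of-the-envelope sketch, not a figure from the authors: it takes the stated 7 billion parameter count at face value and ignores activations, the KV cache, quantization constants, and any layers that `bitsandbytes` keeps in higher precision. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bits_per_param):
    """Approximate memory for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: the "7 billion parameters" stated in this card, taken at face value.
NUM_PARAMS = 7e9

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # 16 bits per weight in bfloat16
int4_gib = weight_memory_gib(NUM_PARAMS, 4)   # 4 bits per weight when quantized

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"4-bit weights:    ~{int4_gib:.1f} GiB")
```

Actual GPU usage will be higher than either estimate, so treat the numbers only as a lower bound when choosing hardware.
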
## Model Architecture

DictaLM-2.0 is based on the [Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) model with the following changes:
- An extended tokenizer with tokens for Hebrew, increasing the compression ratio.
- Continued pretraining on over 190B tokens of naturally occurring text, 50% Hebrew and 50% English.

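The card does not define how the compression ratio is measured. One common reading, assumed here, is the average number of tokens per word, where fewer tokens per word means better compression. The sketch below measures that quantity for any tokenizer function. The two toy tokenizers are hypothetical stand-ins used only to keep the example self-contained; they are not the Mistral or DictaLM-2.0 tokenizers.

```python
def tokens_per_word(tokenize, text):
    """Average number of tokens produced per whitespace-separated word."""
    words = text.split()
    return len(tokenize(text)) / len(words)


# Hypothetical stand-in for a tokenizer with poor Hebrew coverage:
# every non-space character becomes its own token.
def char_level_tokenize(text):
    return [ch for ch in text if not ch.isspace()]


# Hypothetical stand-in for a tokenizer whose vocabulary covers whole words.
def word_level_tokenize(text):
    return text.split()


sample = "שמעתי הבנתי"
print(tokens_per_word(char_level_tokenize, sample))
print(tokens_per_word(word_level_tokenize, sample))
```

With a real Hugging Face tokenizer, its `tokenize` method could be passed as the `tokenize` argument to compare the base and extended vocabularies on the same Hebrew text.
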
## Notice

DictaLM-2.0 is a pretrained base model and therefore does not have any moderation mechanisms.

## Citation

If you use this model, please cite:

```bibtex
[Will be added soon]
```