Introduction:
The fine-tuned model is based on the davidfred/Qwen2.5-32B pre-trained language model. It was fine-tuned with the provided script (fine.py) to specialize in answering questions about Israeli law. The model can generate concise, relevant answers in Hebrew that reference the relevant legal cases and legislation.

Model Details:
- Base Model: davidfred/Qwen2.5-32B
- Fine-tuned Model: Qwen2.5-32BHeb
- Training Data: Processed Wikipedia dataset (/home/azureuser/fredlebonexperim002/extracted_text)
- Training Configuration:
  - 4-bit quantization using BitsAndBytesConfig
  - LoRA (Low-Rank Adaptation) with r=16, lora_alpha=32, lora_dropout=0.05
  - Training hyperparameters:
    - Batch size: 8 per device
    - Gradient accumulation steps: 4
    - Learning rate: 1e-4
    - Number of epochs: 1
    - Optimizer: AdamW
    - LR scheduler: Cosine with warmup

How to Use the Model:
Install the required dependencies:
- torch
- transformers
- datasets
- peft
- trl

Load the fine-tuned model and tokenizer:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id or local path of the fine-tuned checkpoint
model_name = "Qwen2.5-32B-lawmew"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
Define the prompt template for asking questions:
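```python
# The Hebrew instructions read, in English: "Guidelines for answering the question:
# the answer must be in Hebrew only; give a short, focused and clear answer;
# address the question that was asked directly. Question: ... Answer:"
PROMPT_GUIDE = """ הנחיות למענה על השאלה:
- התשובה חייבת להיות רק בעברית.
- תן תשובה קצרה, ממוקדת וברורה.
- התייחס ישירות לשאלה שנשאלה.
שאלה: {question}
תשובה: """
```
Generate text using the model: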
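```python
def generate_text(prompt, max_length=1024, temperature=0.7, top_p=0.92, top_k=50):
    # Fill the question into the prompt template
    instruction = PROMPT_GUIDE.format(question=prompt)
    # Tokenize and move the tensors to the device that holds the model
    inputs = tokenizer(instruction, return_tensors='pt').to(model.device)
    output_ids = model.generate(
        input_ids=inputs.input_ids,
        attention_mask=inputs.attention_mask,
        max_length=max_length,  # counts prompt tokens plus generated tokens
        num_return_sequences=1,
        do_sample=True,
        top_p=top_p,
        top_k=top_k,
        temperature=temperature,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )
    # Decode only the newly generated tokens, so the echoed prompt is dropped
    new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
    return response
```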
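To make the sampling arguments of generate_text more concrete, the sketch below applies temperature, top-k and top-p to a toy list of logits in plain Python. It is a simplified illustration, not the transformers implementation: the function name filter_distribution and the example logits are hypothetical, and the order (temperature, then top-k, then top-p) is assumed to mirror what the library does.

```python
import math

def filter_distribution(logits, temperature=0.7, top_k=50, top_p=0.92):
    """Return renormalized probabilities of the tokens that survive filtering."""
    # Temperature: divide the logits before the softmax; values below 1 sharpen it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable token indices.
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    top_mass = sum(probs[i] for i in top)

    # Top-p: walk down the ranking until the kept share of that mass reaches top_p.
    kept, mass = [], 0.0
    for i in top:
        kept.append(i)
        mass += probs[i] / top_mass
        if mass >= top_p:
            break

    # Renormalize over the survivors; sampling would draw from this distribution.
    kept_mass = sum(probs[i] for i in kept)
    return {i: probs[i] / kept_mass for i in kept}

# Toy example: four candidate tokens, of which only tokens 0 and 1 survive.
toy_logits = [2.0, 1.0, 0.0, -1.0]
print(filter_distribution(toy_logits, temperature=1.0, top_k=3, top_p=0.9))
```

Lowering temperature or top_p concentrates the probability on fewer tokens, which tends to make answers more deterministic; raising them allows more varied wording.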
Ask a question and get the model's response:
```python
# In English: "What are the conditions for obtaining Israeli citizenship?"
question = "מהם התנאים לקבלת אזרחות ישראלית?"
response = generate_text(question)
print(response)
```
The model generates a concise answer in Hebrew that references the legal cases and legislation relevant to the question.
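The training configuration listed under Model Details implies a few derived quantities that are easy to check by hand. The sketch below computes them in plain Python. The helper names are hypothetical, the number of devices is not stated in this card (one device is assumed by default), and the 4096 x 4096 matrix is only an illustrative size, not a dimension of this model.

```python
# Values copied from the Training Configuration list in Model Details.
PER_DEVICE_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 4
LORA_R = 16
LORA_ALPHA = 32

def effective_batch_size(num_devices=1):
    """Sequences that contribute to each optimizer step (device count is an assumption)."""
    return PER_DEVICE_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * num_devices

def lora_scaling(r=LORA_R, alpha=LORA_ALPHA):
    """Factor applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / r

def lora_parameters(d_in, d_out, r=LORA_R):
    """Trainable weights a rank-r adapter adds to one d_out x d_in matrix."""
    return r * (d_in + d_out)

print(effective_batch_size())       # 32
print(lora_scaling())               # 2.0
print(lora_parameters(4096, 4096))  # 131072 (illustrative matrix size)
```

With these settings each optimizer step sees 32 sequences per device, and the adapter update is scaled by 2 before it is added to the frozen weights.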