
## Introduction

This model is based on the davidfred/Qwen2.5-32B pre-trained language model and was fine-tuned with the provided script (fine.py) to specialize in answering questions about Israeli law. It generates concise, relevant answers in Hebrew while referencing relevant legal cases and legislation.

## Model Details

- **Base model:** davidfred/Qwen2.5-32B
- **Fine-tuned model:** Qwen2.5-32BHeb
- **Training data:** processed Wikipedia dataset (`/home/azureuser/fredlebonexperim002/extracted_text`)
- **Training configuration** (see the sketch after this list):
  - 4-bit quantization using `BitsAndBytesConfig`
  - LoRA (Low-Rank Adaptation) with `r=16`, `lora_alpha=32`, `lora_dropout=0.05`
- **Training hyperparameters:**
  - Batch size: 8 per device
  - Gradient accumulation steps: 4
  - Learning rate: 1e-4
  - Number of epochs: 1
  - Optimizer: AdamW
  - LR scheduler: cosine with warmup
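The card does not reproduce fine.py, so the following is a minimal sketch of what the configuration above corresponds to, assuming standard `transformers`/`peft` APIs. The compute dtype, warmup ratio, and output directory are assumptions rather than values from the card:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit quantization, as listed in the training configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: dtype not stated in the card
)

model = AutoModelForCausalLM.from_pretrained(
    "davidfred/Qwen2.5-32B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA settings from the card
lora_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)

# Hyperparameters from the card
training_args = TrainingArguments(
    output_dir="Qwen2.5-32BHeb",   # assumption
    per_device_train_batch_size=8,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    num_train_epochs=1,
    optim="adamw_torch",
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,             # assumption: "with warmup", value not given
)
```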

## How to Use the Model

1. Install the required dependencies: `torch`, `transformers`, `datasets`, `peft`, `trl`.

2. Load the fine-tuned model and tokenizer:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen2.5-32B-lawmew"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
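Loading a 32B-parameter model in full precision requires far more GPU memory than a typical single device provides. If memory is limited, the model can instead be loaded with the same kind of 4-bit quantization the card describes for training (a minimal sketch, assuming `bitsandbytes` is installed; the compute dtype is an assumption, not a value from the card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: dtype not stated in the card
)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available devices
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```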

3. Define the prompt template for asking questions:

```python
PROMPT_GUIDE = """
הנחיות למענה על השאלה:

  1. התשובה חייבת להיות רק בעברית.
  2. תן תשובה קצרה, ממוקדת וברורה.
  3. התייחס ישירות לשאלה שנשאלה.

שאלה: {question}

תשובה: """
```

In English, the template reads: "Guidelines for answering the question: 1. The answer must be in Hebrew only. 2. Give a short, focused, and clear answer. 3. Address the question that was asked directly. Question: {question} Answer:".

4. Generate text using the model:

```python
def generate_text(prompt, max_length=1024, temperature=0.7, top_p=0.92, top_k=50):
    # Wrap the question in the Hebrew answering guidelines
    instruction = PROMPT_GUIDE.format(question=prompt)
    # Tokenize and move the prompt tokens to the model's device
    input_ids = tokenizer(instruction, return_tensors='pt').input_ids.to(model.device)

    output_ids = model.generate(
        input_ids=input_ids,
        max_length=max_length,
        num_return_sequences=1,
        do_sample=True,
        top_p=top_p,
        top_k=top_k,
        temperature=temperature,
        eos_token_id=tokenizer.eos_token_id,
        pad_token_id=tokenizer.pad_token_id,
    )

    # Decode the full sequence, then strip the prompt to keep only the answer
    generated_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    response = generated_text[len(instruction):].strip()
    return response
```
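Slicing the decoded string with `len(instruction)` assumes the decoded prompt reproduces the input string exactly, which not every tokenizer guarantees. A slightly more robust variant (an alternative sketch, not from the original card) slices the generated token IDs instead; it would replace the last three lines of `generate_text`:

```python
    # Keep only the tokens generated after the prompt, then decode them
    new_tokens = output_ids[0][input_ids.shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```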

5. Ask a question and get the model's response:

```python
question = "מהם התנאים לקבלת אזרחות ישראלית?"  # "What are the conditions for obtaining Israeli citizenship?"
response = generate_text(question)
print(response)
```

The model will generate a concise answer in Hebrew, referencing relevant legal cases and legislation based on the provided question.
