
Model Card for Enhanced Language Model with LoRA

Model Description

This model is a LoRA fine-tuned language model based on beomi/ko-gemma-2b. It was trained on the ljp_criminal subset of the lbox/lbox_open dataset, prepared by merging each example's facts field with ruling.text into a single training_text field. This approach aims to enhance the model's ability to understand and generate legal and factual text. Fine-tuning was performed on two A100 GPUs.
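
For reference, data preparation along these lines might look as follows. This is a minimal sketch, not the authors' exact script: it assumes the ljp_criminal configuration of lbox/lbox_open on the Hugging Face Hub, and the newline separator joining the two fields is an assumption, since the card does not specify a template.

from datasets import load_dataset

# Load the ljp_criminal subset of lbox/lbox_open.
dataset = load_dataset("lbox/lbox_open", "ljp_criminal")

def build_training_text(example):
    # Merge the facts field with the nested ruling.text field into the
    # training_text field consumed by SFTTrainer later in this card.
    # The "\n" separator is an assumption; the card does not document one.
    example["training_text"] = example["facts"] + "\n" + example["ruling"]["text"]
    return example

dataset = dataset.map(build_training_text)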

LoRA Configuration

  • LoRA Alpha: 32
  • Rank (r): 16
  • LoRA Dropout: 0.05
  • Bias Configuration: None
  • Targeted Modules:
    • Query Projection (q_proj)
    • Key Projection (k_proj)
    • Value Projection (v_proj)
    • Output Projection (o_proj)
    • Gate Projection (gate_proj)
    • Up Projection (up_proj)
    • Down Projection (down_proj)
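
These settings map directly onto a PEFT LoraConfig. The following is a minimal sketch assuming the Hugging Face peft library; task_type is an assumption, as the card does not state it, but the base model is a causal LM.

from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                    # rank
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",   # assumed: ko-gemma-2b is a causal LM
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)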

Training Configuration

  • Training Epochs: 1
  • Batch Size per Device: 2
  • Optimizer: Paged AdamW (32-bit)
  • Learning Rate: 0.00005
  • Max Gradient Norm: 0.3
  • Learning Rate Scheduler: Constant
  • Warm-up Steps: 100
  • Gradient Accumulation Steps: 1
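
In transformers terms, these hyperparameters correspond roughly to the TrainingArguments below. A minimal sketch: output_dir is illustrative, and the paged 32-bit AdamW is expressed as the paged_adamw_32bit optimizer name, which the card implies but does not spell out.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",            # illustrative; not specified in this card
    num_train_epochs=1,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=1,
    optim="paged_adamw_32bit",         # paged 32-bit AdamW
    learning_rate=5e-5,
    max_grad_norm=0.3,
    lr_scheduler_type="constant",
    warmup_steps=100,
)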

Model Training and Evaluation

The model was trained and evaluated using trl's SFTTrainer with the following parameters:

  • Max Sequence Length: 4096
  • Dataset Text Field: training_text
  • Packing: Disabled
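
Putting the pieces together, the training call would look roughly like this. A minimal sketch assuming trl's SFTTrainer, with model as the loaded beomi/ko-gemma-2b base and dataset, lora_config, and training_args as sketched above; note that recent trl releases move dataset_text_field, max_seq_length, and packing into SFTConfig.

from trl import SFTTrainer

trainer = SFTTrainer(
    model=model,                        # base beomi/ko-gemma-2b model
    args=training_args,
    train_dataset=dataset["train"],
    peft_config=lora_config,
    dataset_text_field="training_text",
    max_seq_length=4096,
    packing=False,
)
trainer.train()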

How to Get Started with the Model

Use the following code snippet to load the model with Hugging Face Transformers:

from transformers import AutoModelForCausalLM, AutoTokenizer

# Replace "your_model_id" with this repository's Hub ID.
model = AutoModelForCausalLM.from_pretrained("your_model_id")
tokenizer = AutoTokenizer.from_pretrained("your_model_id")

# Example usage
inputs = tokenizer("Example input text", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)  # cap generation length for the example
print(tokenizer.decode(outputs[0], skip_special_tokens=True))