---
library_name: transformers
tags:
- trl
- sft
- quantization
- 4bit
- lora
---

# Model Card for Medical Transcription Model (Gemma-MedTr)

This model is a fine-tuned variant of `Gemma-2-2b` for medical transcription tasks, trained with 4-bit quantization and Low-Rank Adaptation (LoRA). It handles transcription processing, keyword extraction, and medical specialty classification.

## Model Details

- **Developed by:** Harish Nair
- **Organization:** University of Ottawa
- **License:** Apache 2.0
- **Fine-tuned from:** [Gemma-2-2b](https://huggingface.co/google/gemma-2-2b)
- **Model type:** Transformer-based language model for medical transcription processing
- **Language(s):** English

### Training Details

- **Training Loss:** 1.4791 (final value, at step 10)
- **Training Configuration:**
  - LoRA with `r=8`, targeting specific transformer modules for adaptation.
  - 4-bit quantization using the `nf4` quantization type and `bfloat16` compute precision.
- **Training Runtime:** 20.85 seconds, at approximately 1.92 samples per second.

## How to Use

The snippet below loads the tokenizer and the base `google/gemma-2-2b` checkpoint with the 4-bit quantization configuration described above:

```python
import os

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, PeftModel  # needed to attach LoRA adapters (not shown here)

model_id = "google/gemma-2-2b"

# Gemma is a gated model, so a Hugging Face read token is required.
# Assumption: the token is stored in the HF_TOKEN environment variable.
access_token_read = os.environ.get("HF_TOKEN")

# 4-bit NF4 quantization with bfloat16 compute precision
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, token=access_token_read)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    token=access_token_read,
)
```
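
To give a feel for what the LoRA rank `r=8` listed under Training Details means in practice, the sketch below counts the trainable parameters LoRA adds to a single linear layer. The layer shape is a hypothetical value chosen for illustration, and the helper `lora_param_count` is not part of this model's training code; the card does not state which modules were adapted.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one (d_out x d_in) linear layer.

    LoRA keeps the original weight frozen and learns a low-rank update
    B @ A, where A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * d_in + d_out * r


# Hypothetical layer shape, for illustration only.
d_in, d_out, r = 2304, 2048, 8

full = d_in * d_out                      # frozen weights in the layer
added = lora_param_count(d_in, d_out, r)  # trainable LoRA weights

print(f"frozen weights: {full:,}")
print(f"LoRA weights:   {added:,} ({added / full:.2%} of the layer)")
```

Under these assumed dimensions, the adapter trains well under 1% of the layer's weights, which is why a low rank such as `r=8` pairs naturally with a 4-bit quantized base model on limited hardware.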