Model Card: mou3az/IT-General_Question-Generation

Base Model: facebook/bart-base

Fine-tuning method: PEFT (LoRA)

Datasets: squad_v2, drop, mou3az/IT_QA-QG

Task: Generating questions from a given context and answer

Language: English

Loading the model

  from peft import PeftModel, PeftConfig
  from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

  HUGGING_FACE_USER_NAME = "mou3az"
  model_name = "IT-General_Question-Generation"
  peft_model_id = f"{HUGGING_FACE_USER_NAME}/{model_name}"

  # Read the adapter config to find the base model it was trained on
  config = PeftConfig.from_pretrained(peft_model_id)

  # Load the base model (facebook/bart-base) and its tokenizer
  model = AutoModelForSeq2SeqLM.from_pretrained(config.base_model_name_or_path, return_dict=True, load_in_8bit=False, device_map="auto")
  QG_tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

  # Attach the fine-tuned LoRA adapter to the base model
  QG_model = PeftModel.from_pretrained(model, peft_model_id)

At inference time

  def get_question(context, answer):
      # Generate on whatever device the model was mapped to
      device = next(QG_model.parameters()).device

      # Prompt format the model was fine-tuned on
      input_text = f"Given the context '{context}' and the answer '{answer}', what question can be asked?"
      encoding = QG_tokenizer(input_text, return_tensors="pt").to(device)

      # Beam search with n-gram repetition blocking; return the single best question
      output_tokens = QG_model.generate(**encoding, early_stopping=True, num_beams=5, num_return_sequences=1, no_repeat_ngram_size=2, max_length=100)

      # Drop special tokens and any leading "question:" prefix
      return QG_tokenizer.decode(output_tokens[0], skip_special_tokens=True).replace("question:", "").strip()
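
A minimal usage sketch (the context, answer, and printed question below are illustrative, not actual model output):

  context = "TCP is a connection-oriented protocol that guarantees ordered, reliable delivery of data."
  answer = "TCP"
  print(get_question(context, answer))
  # Illustrative output: "What protocol guarantees ordered, reliable delivery of data?"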

Training parameters and hyperparameters

The following were used during training:

For LoRA:

r=18

alpha=8

For training arguments:

gradient_accumulation_steps=24

per_device_train_batch_size=8

per_device_eval_batch_size=8

max_steps=1000

warmup_steps=50

weight_decay=0.05

learning_rate=3e-3

lr_scheduler_type="linear"
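
As a rough sketch, these values map onto peft and transformers configuration objects as follows; task_type, lora_dropout, and output_dir are assumptions not stated in this card:

  from peft import LoraConfig, get_peft_model
  from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

  base = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")

  # LoRA settings from the list above (task_type and lora_dropout are assumptions)
  lora_config = LoraConfig(r=18, lora_alpha=8, task_type="SEQ_2_SEQ_LM", lora_dropout=0.1)
  peft_model = get_peft_model(base, lora_config)

  # Training arguments from the list above (output_dir is a placeholder)
  training_args = Seq2SeqTrainingArguments(
      output_dir="qg-lora",
      gradient_accumulation_steps=24,
      per_device_train_batch_size=8,
      per_device_eval_batch_size=8,
      max_steps=1000,
      warmup_steps=50,
      weight_decay=0.05,
      learning_rate=3e-3,
      lr_scheduler_type="linear",
  )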

Training Results

| Epoch | Optimization Step | Training Loss | Validation Loss |
|------:|------------------:|--------------:|----------------:|
|   0.0 |                84 |        4.6426 |        4.704238 |
|   3.0 |               252 |        1.5094 |        1.202135 |
|   6.0 |               504 |        1.2677 |        1.146177 |
|   9.0 |               756 |        1.2613 |        1.112074 |
|  12.0 |              1000 |        1.1958 |        1.109059 |

Performance Metrics on Evaluation Set:

Training Loss: 1.1958

Evaluation Loss: 1.109059

BERTScore: 0.8123

ROUGE: 0.532144

FuzzyWuzzy similarity: 0.74209
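
For reference, a hedged sketch of how such metrics can be computed with the evaluate and fuzzywuzzy libraries (the actual evaluation script is not part of this card, and the example strings are illustrative):

  import evaluate
  from fuzzywuzzy import fuzz

  predictions = ["What protocol guarantees reliable delivery?"]   # model outputs (illustrative)
  references = ["Which protocol guarantees reliable delivery?"]   # gold questions (illustrative)

  bertscore = evaluate.load("bertscore").compute(predictions=predictions, references=references, lang="en")
  rouge = evaluate.load("rouge").compute(predictions=predictions, references=references)

  # fuzz.ratio is on a 0-100 scale; divide by 100 to match the 0-1 values above
  fuzzy = sum(fuzz.ratio(p, r) for p, r in zip(predictions, references)) / (100 * len(predictions))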
