metadata
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: PegasusMedicalSummary
    results: []
widget:
  - text: >-
      PREOPERATIVE DIAGNOSIS: Chronic obstructive pulmonary disease
      (COPD).POSTOPERATIVE DIAGNOSIS: COPD.PROCEDURE: Bilateral video-assisted
      thoracoscopic lung volume reduction surgery (LVRS).ANESTHESIA: General
      anesthesia with single-lumen endotracheal tube.INDICATIONS FOR PROCEDURE:
      This 65-year-old female patient presented with severe COPD symptoms,
      including dyspnea and decreased exercise tolerance. After thorough
      evaluation and discussions of available treatment options, the decision
      for bilateral LVRS was made in order to improve lung function and quality
      of life.PROCEDURE IN DETAIL: Informed consent was obtained after
      explaining the risks and benefits of the procedure. The patient was placed
      in a lateral decubitus position, and general anesthesia was induced.
      Bilateral LVRS was performed using video-assisted thoracoscopic
      techniques. Intraoperatively, attention was given to minimize bleeding and
      ensure proper lung tissue removal. The patient tolerated the procedure
      well, and postoperative care instructions were provided.
    example_title: Example 1
  - text: >-
      PREOPERATIVE DIAGNOSIS: Coronary artery disease.POSTOPERATIVE DIAGNOSIS:
      Coronary artery disease.PROCEDURE: Coronary artery bypass grafting (CABG)
      surgery.ANESTHESIA: General anesthesia with cardiopulmonary
      bypass.INDICATIONS FOR PROCEDURE: This 60-year-old male patient presented
      with significant coronary artery disease, with multiple vessels showing
      significant stenosis on angiography. After a thorough evaluation of his
      condition and considering the extent of the disease, the decision was made
      to proceed with CABG surgery to improve blood flow to the heart
      muscle.PROCEDURE IN DETAIL: After obtaining informed consent and ensuring
      adequate preoperative preparations, the patient was brought to the
      operating room. General anesthesia was induced, and cardiopulmonary bypass
      was established. The bypass grafts were harvested, and the stenotic
      coronary arteries were bypassed using appropriate grafts. Hemostasis was
      ensured, and the patient was weaned off cardiopulmonary bypass. The
      patient was transferred to the intensive care unit for postoperative
      monitoring and recovery. Postoperative care instructions were provided to
      the patient and family members.
    example_title: Example 2
  - text: >-
      PREOPERATIVE DIAGNOSIS: Lumbar disc herniation.POSTOPERATIVE DIAGNOSIS:
      Lumbar disc herniation.PROCEDURE: Minimally invasive lumbar
      microdiscectomy.ANESTHESIA: General anesthesia with endotracheal
      intubation.INDICATIONS FOR PROCEDURE: This 42-year-old male patient
      presented with radiating low back pain and leg numbness, along with
      positive imaging findings of a lumbar disc herniation. After conservative
      treatment failed to provide relief, the decision was made to proceed with
      a minimally invasive microdiscectomy to alleviate the symptoms.PROCEDURE
      IN DETAIL: The patient was positioned prone on the operating table, and
      general anesthesia was administered. A small incision was made, and using
      fluoroscopic guidance, the herniated disc material was carefully removed.
      The surgical site was inspected for any bleeding or complications before
      closure. The patient was awakened from anesthesia without any immediate
      postoperative complications. Postoperative instructions were given
      regarding activity restrictions and pain management.
    example_title: Example 3

PegasusMedicalSummary

Authors

This model was created by mereshd, renegarza and jasmeeetsingh.

This model is a fine-tuned version of google/pegasus-xsum on the MTSamples dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1438
  • Rouge1: 0.4318
  • Rouge2: 0.2525
  • Rougel: 0.3524
  • Rougelsum: 0.3525
  • Gen Len: 55.882
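As a reminder of what the ROUGE figures above measure, here is a simplified sketch of ROUGE-N and ROUGE-L F1. It assumes lowercased whitespace tokenization with no stemming, so it will not reproduce the reported numbers exactly; the function names are illustrative and this is not the evaluation code used for this model.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, pred_total, ref_total):
    """Turn an overlap count into an F1 score; 0.0 when it is undefined."""
    if overlap == 0 or pred_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / pred_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(prediction, reference, n=1):
    """Simplified ROUGE-N F1 based on clipped n-gram overlap."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((pred & ref).values())  # & keeps the minimum count per n-gram
    return f1(overlap, sum(pred.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(prediction, reference):
    """Simplified ROUGE-L F1 based on the longest common subsequence."""
    pred = prediction.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(pred, ref), len(pred), len(ref))
```

Rouge1 and Rouge2 correspond to `n=1` and `n=2`; Rougel rewards words that appear in the same order in both texts even when they are not adjacent.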

Project Purpose

Our goal is to deliver an effective summarization solution that makes doctors' discharge notes more structured and comprehensive. A physician's job goes far beyond saving lives: doctors are also responsible for providing a comforting environment for their patients. In such a high-stress setting, it is difficult to follow a consistent structure and write notes that anyone can interpret. The result is long, convoluted discharge documentation that is tedious to read and use. Our model is intended to ease much of the discomfort of creating and using physician notes, which should ultimately lead to smoother workflows and greater convenience for healthcare providers.

Intended Use

Model

We used Google's Pegasus abstractive summarization model to generate summaries of the discharge transcriptions included in the MTSamples dataset. These summaries were later used to prompt the Transformer's Masked Language Modeling (MLM) functionality, training the model to generate meaningful text with better structure and organization than the original.

Use Cases

This model enables efficient summarization of complex, densely documented doctor notes. It gives instant access to the key insights, with proper semantic cues in place. In addition, data engineers who work with electronic patient records often spend an excessive amount of time parsing unstructured discharge notes to accomplish their tasks. The solution will be instrumental for staff who do not face patients directly but hold back-end roles that are also of immense importance.
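A minimal usage sketch follows, assuming the checkpoint is loaded through the Transformers `summarization` pipeline. The helper names, the `model_id` argument (a placeholder for this model's Hub id or a local path), and the `max_length=64` default are assumptions for illustration, not settings documented by the authors.

```python
def load_summarizer(model_id):
    """Build a Transformers summarization pipeline for a checkpoint.

    `model_id` is a placeholder: pass this model's Hub id or a local path.
    The import is deferred so the helper below works without transformers.
    """
    from transformers import pipeline

    return pipeline("summarization", model=model_id)


def summarize_notes(notes, summarizer, max_length=64):
    """Summarize each note and return the summaries in input order.

    `summarizer` is any callable that behaves like a summarization pipeline,
    i.e. it returns a list of dicts with a "summary_text" key.
    """
    summaries = []
    for note in notes:
        text = " ".join(note.split())  # collapse line breaks and repeated spaces
        if not text:
            summaries.append("")  # nothing to summarize
            continue
        # truncation=True clips inputs that exceed the model's maximum length
        result = summarizer(text, max_length=max_length, truncation=True)
        summaries.append(result[0]["summary_text"].strip())
    return summaries
```

Typical use would be `summarize_notes(notes, load_summarizer(model_id))`. Passing the summarizer in as an argument keeps the batching logic easy to test with a stub.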

Limitations & Future Aspirations

With more data and further training, more refined results might be achieved. Synthetic transcriptions could be generated with GPT models and used as additional training data. Further improvements to the model's summarization capabilities have also been considered. One of these is summarization based on clustered titles within the discharge notes, which would allow easier traversal through partitioned summaries and result in better structure.
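To illustrate what summarization by section title could look like, the sketch below splits a note on headers such as "PREOPERATIVE DIAGNOSIS:" so that each section can be summarized separately. This feature is not implemented in the model; the header pattern and the function name are assumptions based on the example notes.

```python
import re

# Assumed header shape: a run of capital letters and spaces ending in a colon,
# e.g. "PREOPERATIVE DIAGNOSIS:" or "PROCEDURE IN DETAIL:".
HEADER = re.compile(r"([A-Z][A-Z ]+):")


def split_sections(note):
    """Return (header, body) pairs in document order.

    Text before the first header is dropped, and notes with no matching
    header yield an empty list.
    """
    matches = list(HEADER.finditer(note))
    sections = []
    for i, match in enumerate(matches):
        # A section body runs until the next header or the end of the note
        end = matches[i + 1].start() if i + 1 < len(matches) else len(note)
        body = " ".join(note[match.end():end].split())
        sections.append((match.group(1).strip(), body))
    return sections
```

Each body could then be passed to the summarizer on its own, giving one short summary per titled section.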

Training and evaluation data

Each generated summary was paired with its original transcription. After the data was split into train and test sets, the table was converted into a JSON file. This structure allowed us to train the model effectively on transcription-to-summary pairs. After all the metrics were evaluated, a number of additional medical transcriptions were generated with generative transformers and summarized; in this testing, the model performed well.
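The preparation steps above can be sketched roughly as follows. The field names (`transcription`, `summary`), the 20% test fraction, and the shuffle seed are hypothetical, since the card does not state the actual schema or split ratio.

```python
import json
import random


def build_splits(transcriptions, summaries, test_fraction=0.2, seed=42):
    """Pair each transcription with its summary and split into train/test."""
    if len(transcriptions) != len(summaries):
        raise ValueError("transcriptions and summaries must have the same length")
    records = [
        {"transcription": t, "summary": s}
        for t, s in zip(transcriptions, summaries)
    ]
    random.Random(seed).shuffle(records)  # seeded so the split is reproducible
    n_test = int(len(records) * test_fraction)
    return {"train": records[n_test:], "test": records[:n_test]}


def write_json(splits, path):
    """Write the splits to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(splits, f, ensure_ascii=False, indent=2)
```

A file written this way can be read back with `json.load`, or loaded with the Datasets library's JSON loader if the layout is adapted to what that loader expects.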

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
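Two of these values follow from the others. The total train batch size is the per-step batch size times the gradient accumulation steps (1 × 4 = 4, assuming a single device), and a linear scheduler decays the learning rate from 2e-05 toward zero over training. The sketch below shows both calculations; it assumes no warmup, since none is listed, and takes the 3996 total steps from the final row of the training results.

```python
def effective_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples that contribute to each optimizer update."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step, total_steps, base_lr=2e-05, warmup_steps=0):
    """Learning rate at a given step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(effective_batch_size(1, 4))
    for step in (0, 999, 1998, 2997, 3996):
        print(step, linear_lr(step, total_steps=3996))
```

Under these assumptions the learning rate is about half its initial value midway through training and reaches zero at the last step.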

Training results

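| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 6.5172        | 1.0   | 999  | 0.1784          | 0.4161 | 0.2373 | 0.3388 | 0.3384    | 52.102  |
| 0.3174        | 2.0   | 1999 | 0.1550          | 0.4236 | 0.2434 | 0.343  | 0.3428    | 54.458  |
| 0.2632        | 3.0   | 2999 | 0.1462          | 0.4269 | 0.2467 | 0.3465 | 0.3464    | 55.503  |
| 0.2477        | 4.0   | 3996 | 0.1438          | 0.4318 | 0.2525 | 0.3524 | 0.3525    | 55.882  |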

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu117
  • Datasets 2.11.0
  • Tokenizers 0.13.3