
PegasusMedicalSummary

Authors

This model was created by mereshd, renegarza and jasmeeetsingh.

This model is a fine-tuned version of google/pegasus-xsum on the MTSamples dataset. It achieves the following results on the evaluation set (a scoring sketch follows the list):

  • Loss: 0.1438
  • Rouge1: 0.4318
  • Rouge2: 0.2525
  • Rougel: 0.3524
  • Rougelsum: 0.3525
  • Gen Len: 55.882
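
Scores of this kind can be reproduced with the `evaluate` library. A minimal scoring sketch, assuming lists of generated and reference summaries are already available (the example strings below are placeholders, not real model output):

```python
# Minimal ROUGE scoring sketch using the `evaluate` library; the example
# predictions/references are illustrative placeholders only.
import evaluate

rouge = evaluate.load("rouge")

predictions = ["The patient underwent laparoscopic cholecystectomy and was discharged home."]
references = ["Laparoscopic cholecystectomy was performed; the patient was discharged in stable condition."]

scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum
```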

Project Purpose

Our goal is to deliver an effective summarization solution that makes physician discharge notes more structured and comprehensible. A physician's job goes far beyond saving lives; doctors are also responsible for providing a comforting environment for their patients. In such a high-stress environment it is difficult to follow a consistent structure and write notes with universal interpretability in mind, which leads to long, convoluted discharge documentation that is tedious to read and reuse. Our model alleviates much of that friction when creating and consuming physician notes, ultimately leading to smoother workflows and greater convenience for healthcare providers.

Intended Use

Model

We leveraged Google's Pegasus abstractive text-summarization model to generate summaries of the discharge transcriptions included in the MTSamples dataset. These transcription–summary pairs were then used with the transformer's masked language modeling (MLM) functionality to train the model to generate meaningful text with better structure and organization than the original notes.
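
For inference, the fine-tuned checkpoint can be loaded with the standard `transformers` summarization pipeline. A minimal sketch, assuming the model is published under a repo id such as `mereshd/PegasusMedicalSummary` (the exact id may differ) and that the input is a raw discharge transcription:

```python
# Minimal inference sketch; the repo id and example note are assumptions.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="mereshd/PegasusMedicalSummary",  # hypothetical repo id
)

discharge_note = (
    "PREOPERATIVE DIAGNOSIS: Chronic cholecystitis. "
    "PROCEDURE: Laparoscopic cholecystectomy. "
    "The patient tolerated the procedure well and was discharged in stable condition."
)

summary = summarizer(
    discharge_note,
    max_length=60,   # average generation length on the evaluation set is ~56 tokens
    truncation=True,
)[0]["summary_text"]
print(summary)
```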

Use Cases

This model enables efficient summarization of densely documented doctor notes, providing quick access to key information with proper semantic cues in place. Additionally, data engineers who work with electronic patient records spend an excessive amount of time parsing unstructured discharge notes to accomplish their tasks. The solution will also be valuable for staff who do not face patients directly but hold important back-end roles.

Limitations & Future Aspirations

With more data and additional training, better results might be achieved; synthetic transcriptions could be generated with GPT models and used as extra training material. Further improvements to the model's summarization capabilities have also been considered, one of which is summarization based on clustered section titles within the discharge notes. That feature would allow easier traversal of the notes through partitioned summaries and result in better structure.

Training and evaluation data

Each generated summary was paired with its original transcription; after splitting the data into train and test sets, the table was converted into a JSON file. This structure allowed us to train the model on transcription-to-summary prompts. After the metrics were evaluated, a number of additional medical transcriptions were produced with generative transformers and summarized, and the model performed well on these tests as well.
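
A rough sketch of that preparation step, assuming the paired transcriptions and summaries live in a CSV with `transcription` and `summary` columns (the file names and column names are assumptions, not the exact pipeline used here):

```python
# Sketch of pairing transcriptions with generated summaries, splitting into
# train/test sets, and writing JSON files for fine-tuning. File and column
# names are hypothetical.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("mtsamples_with_summaries.csv")  # hypothetical input file
df = df[["transcription", "summary"]].dropna()

train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)

# One JSON record per transcription/summary pair, loadable with
# datasets.load_dataset("json", data_files=...)
train_df.to_json("train.json", orient="records", lines=True)
test_df.to_json("test.json", orient="records", lines=True)
```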

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 4
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 4
  • mixed_precision_training: Native AMP
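
A configuration sketch that mirrors these values using the Hugging Face `Seq2SeqTrainingArguments`; the output directory is hypothetical, and the Adam betas/epsilon are the library defaults, which match the values listed above:

```python
# Training-argument sketch reflecting the hyperparameters in this card.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-medical-summary",  # hypothetical output directory
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # gives the effective train batch size of 4
    num_train_epochs=4,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,                       # native AMP mixed-precision training
    predict_with_generate=True,      # generate text during evaluation for ROUGE
)
```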

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
| 6.5172 | 1.0 | 999 | 0.1784 | 0.4161 | 0.2373 | 0.3388 | 0.3384 | 52.102 |
| 0.3174 | 2.0 | 1999 | 0.1550 | 0.4236 | 0.2434 | 0.343 | 0.3428 | 54.458 |
| 0.2632 | 3.0 | 2999 | 0.1462 | 0.4269 | 0.2467 | 0.3465 | 0.3464 | 55.503 |
| 0.2477 | 4.0 | 3996 | 0.1438 | 0.4318 | 0.2525 | 0.3524 | 0.3525 | 55.882 |

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu117
  • Datasets 2.11.0
  • Tokenizers 0.13.3