metadata
license: mit
base_model: facebook/bart-large-cnn
tags:
  - generated_from_trainer
metrics:
  - rouge
model-index:
  - name: conversation-summ
    results: []
datasets:
  - har1/MTS_Dialogue-Clinical_Note
language:
  - en

HealthScribe (A Clinical Note Generator)

This model is a fine-tuned version of facebook/bart-large-cnn on a modified version of the MTS-Dialog dataset.

Model description

The model was developed for the HealthScribe project and is integrated with a Flask web application. The application lets users generate clinical notes from automatic speech recognition (ASR) transcripts of conversations between doctors and patients.

Test data sample for inference

See test.txt for further examples of conversations.

"Doctor: Hi there, I love that dress, very pretty! 
Patient: Thank you for complementing a seventy-two-year-old patient.
Doctor: No, I mean it, seriously. Okay, so you were admitted here in May two thousand nine. You have a history of hypertension, and on June eighteenth two thousand nine you had bad abdominal pain diarrhea and cramps.
Patient: Yes, they told me I might have C Diff? They did a CT of my abdomen and that is when they thought I got the infection.
Doctor: Yes, it showed evidence of diffuse colitis, so I believe they gave you IV antibiotics?
Patient: Yes they did. 
Doctor: Yeah I see here, Flagyl and Levaquin. They started IV Reglan as well for your vomiting.
Patient: Yes, I was very nauseous. Vomited as well.
Doctor: After all this I still see your white blood cells high. Are you still nauseous? 
Patient: No, I do not have any nausea or vomiting, but still have diarrhea. Due to all that diarrhea I feel very weak.
Doctor: Okay. Anything else any other symptoms?
Patient: Actually no. Everything's well.
Doctor: Great.
Patient: Yeah."
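
A minimal inference sketch for a conversation like the one above, assuming the checkpoint is loaded through the Transformers `summarization` pipeline. `MODEL_ID` is a placeholder rather than the confirmed repository name, and the generation settings (`max_length`, `min_length`) are illustrative, not the values used by the HealthScribe application.

```python
# Placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "<user>/<model-repo>"


def format_dialogue(turns):
    """Join (speaker, utterance) pairs into the 'Speaker: text' layout of the sample conversation."""
    return "\n".join(f"{speaker}: {utterance.strip()}" for speaker, utterance in turns)


def generate_note(dialogue, model_id=MODEL_ID, max_length=128, min_length=10):
    """Summarize one doctor-patient dialogue into a clinical note."""
    # Imported lazily so the formatting helper works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=model_id)
    result = summarizer(
        dialogue, max_length=max_length, min_length=min_length, truncation=True
    )
    return result[0]["summary_text"]


turns = [
    ("Doctor", "Are you still nauseous?"),
    ("Patient", "No, I do not have any nausea or vomiting, but still have diarrhea."),
]
dialogue = format_dialogue(turns)
print(dialogue)
# note = generate_note(dialogue)  # downloads the model on first call
```

The model call is left commented out because it downloads the full checkpoint; `truncation=True` cuts off inputs that exceed the encoder's length limit instead of raising an error.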

Intended uses & limitations

The model generates clinical notes from ASR transcripts of doctor-patient conversations. It has the following known limitations:

  • The model rarely generates an N/A output, and it sometimes produces None.
  • When the input is very short (only a few tokens) or very long, the model starts to hallucinate.
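
One way to guard against the input-length limitation is to check the tokenized length before generating. The sketch below is an assumption about how a caller might do this and is not part of HealthScribe. The threshold `MIN_TOKENS` is illustrative, `MAX_TOKENS` is the 1024-position limit of `facebook/bart-large-cnn`, and `count_tokens` is a hypothetical callable that would normally wrap the model's tokenizer.

```python
MIN_TOKENS = 10     # illustrative lower bound, not a documented threshold
MAX_TOKENS = 1024   # max_position_embeddings of facebook/bart-large-cnn


def check_input_length(text, count_tokens, min_tokens=MIN_TOKENS, max_tokens=MAX_TOKENS):
    """Return 'too_short', 'too_long', or 'ok' for a dialogue string.

    count_tokens is any callable mapping text to a token count, for example
    lambda t: len(tokenizer(t)["input_ids"]) with a Hugging Face tokenizer.
    """
    n = count_tokens(text)
    if n < min_tokens:
        return "too_short"
    if n > max_tokens:
        return "too_long"
    return "ok"


# Whitespace splitting stands in for a real tokenizer in this demo.
def whitespace_count(text):
    return len(text.split())


print(check_input_length("Doctor: Great.", whitespace_count))  # too_short
```

A caller could reject or flag inputs classified as `too_short`, and split or truncate those classified as `too_long`, before passing them to the model.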

Training Metrics

Training and evaluation data

The model achieves the following results on the evaluation set:

  • Loss: 0.1562
  • Rouge1: 54.3238
  • Rouge2: 34.2678
  • Rougel: 46.5847
  • Rougelsum: 51.2214
  • Generation Length: 77.04
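
To make the ROUGE numbers concrete, here is a simplified sketch of ROUGE-1 as a unigram-overlap F1, scaled to 0 to 100 like the values above. It is an illustration only: it uses plain whitespace tokenization and no stemming, so it will not reproduce the scores reported here, which were presumably computed with a standard ROUGE implementation. The two example sentences are made up for the demo.

```python
from collections import Counter


def rouge1_f1(prediction, reference):
    """Simplified ROUGE-1: F1 over overlapping lowercase unigrams, on a 0-100 scale."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped overlap: a token is matched at most as often as it occurs in both texts.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100 * 2 * precision * recall / (precision + recall)


# Hypothetical reference note and model output.
reference = "patient denies nausea but reports diarrhea"
prediction = "patient reports diarrhea and weakness"
print(round(rouge1_f1(prediction, reference), 2))  # 54.55
```

Here three of the five predicted words appear in the six-word reference, giving precision 0.6, recall 0.5, and an F1 of about 54.55.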

Training procedure

The model was trained on 1201 training samples and 100 validation samples of the modified MTS-Dialog dataset.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
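
The hyperparameters above map roughly onto `Seq2SeqTrainingArguments` as sketched below. This is a reconstruction under assumptions, not the author's training script: `output_dir` is a placeholder, `predict_with_generate` is assumed because ROUGE was reported, and the step arithmetic assumes a single device. Under those assumptions, 1201 training samples give 600 optimizer steps per epoch and 1800 in total.

```python
# Values taken from the hyperparameter list; output_dir is a placeholder.
training_kwargs = dict(
    output_dir="conversation-summ",
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=2,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                   # "Native AMP" mixed precision; needs a CUDA GPU
    predict_with_generate=True,  # assumption: needed to compute ROUGE during evaluation
)


def build_training_args(**overrides):
    """Create Seq2SeqTrainingArguments from the settings above (needs transformers installed)."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**training_kwargs, **overrides})


# Effective batch size and optimizer step count, assuming one device.
num_train_samples = 1201
total_train_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
batches_per_epoch = num_train_samples // training_kwargs["per_device_train_batch_size"]
steps_per_epoch = batches_per_epoch // training_kwargs["gradient_accumulation_steps"]
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]
print(total_train_batch_size, steps_per_epoch, total_steps)  # 2 600 1800
```

The computed effective batch size of 2 agrees with the listed `total_train_batch_size`. `build_training_args` is not called here because constructing the arguments with `fp16=True` requires a GPU-enabled environment.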

Training results

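| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|---------------|-------|------|-----------------|---------|---------|---------|-----------|---------|
| 0.4426        | 1.0   | 600  | 0.1588          | 52.8864 | 33.253  | 44.9089 | 50.5072   | 69.38   |
| 0.1137        | 2.0   | 1201 | 0.1517          | 56.8499 | 35.309  | 48.2171 | 53.6983   | 72.74   |
| 0.0796        | 3.0   | 1800 | 0.1562          | 54.3238 | 34.2678 | 46.5847 | 51.2214   | 77.04   |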

Framework versions

  • Transformers 4.39.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2