Update README.md (#1)
- Update README.md (d2c89d52baa71eab860dd7545b3919c5207f5285)
Co-authored-by: Danil Meresh <mereshd@users.noreply.huggingface.co>
README.md
CHANGED
@@ -32,19 +32,33 @@ It achieves the following results on the evaluation set:
|
|
32 |
- Rougelsum: 0.3525
|
33 |
- Gen Len: 55.882
|
34 |
|
35 |
-
|
36 |
-
|
37 |
-
|
38 |
-
|
39 |
-
|
40 |
-
|
41 |
-
|
42 |
-
|
43 |
-
|
44 |
-
|
45 |
-
|
46 |
-
|
47 |
-
|
48 |
|
49 |
### Training hyperparameters
|
50 |
|
|
|
32 |
- Rougelsum: 0.3525
|
33 |
- Gen Len: 55.882
|
34 |
|
35 |
+
### Project Purpose
|
36 |
+
|
37 |
+
Our goal is to deliver an effective summarization solution aimed at making doctor discharge notes more structured and comprehensive.
|
38 |
+
A physician's job goes far beyond saving lives; doctors are also responsible for providing a comforting environment for their patients.
|
39 |
+
With that in mind, while accommodating patients in a high-stress environment, it is difficult to follow a consistent structure and write notes that any reader can interpret.
|
40 |
+
This leads to long and convoluted discharge documentation that is tedious to use and understand. Our model aims to alleviate
|
41 |
+
a significant amount of discomfort when creating and using physician notes, which should ultimately lead to smoother workflows and greater convenience for healthcare providers.
|
42 |
+
|
43 |
+
### Intended Use
|
44 |
+
#### Model
|
45 |
+
We used Google's Pegasus abstractive text summarization model to generate summaries of the discharge transcriptions included in the MTSamples dataset.
|
46 |
+
These summaries were later used to prompt the Transformer's Masked Language Modeling (MLM) functionality to train the model to generate meaningful text with better structure and organization than the original.
|
47 |
+
Additionally, data engineers who work with electronic patient records consistently spend an excessive amount of time parsing unstructured discharge notes to accomplish their tasks.
|
48 |
+
The solution should be instrumental for staff who do not face patients directly but hold back-end roles that are also of immense importance.
|
49 |
+
|
50 |
+
Data Engineer?
|
51 |
+
#### Use Cases
|
52 |
+
This model allows for efficient summarization of complexly documented doctor notes. It provides quick access to insights with proper semantic cues in place.
|
53 |
+
|
54 |
+
##### Limitations & Future Aspirations
|
55 |
+
With more data and further training, the model might achieve more reliable results. Further improvements to the model's summarization capabilities have also been considered.
|
56 |
+
One of these is summarization based on clustered titles within the discharge notes. This feature would allow easier traversal through partitioned summaries and result in better structure.
|
57 |
+
|
58 |
+
##### Training and evaluation data
|
59 |
+
The generated summaries were paired with their original transcriptions and, after the data was split into train and test sets, the table was converted into a JSON file.
|
60 |
+
This structure allowed us to train the model effectively on transcription-to-summary pairs. After all the metrics were evaluated, a number of medical transcriptions were generated through
|
61 |
+
generative transformers to summarize, and upon testing, the model performed well.
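
The data preparation described above (pairing each transcription with its generated summary, splitting into train and test sets, and exporting to JSON) could look roughly like the sketch below. It is only an illustration, not the project's actual script: the function name `split_and_export`, the field names `transcription` and `summary`, the 80/20 split, the fixed seed, and the output file names are all assumptions.

```python
import json
import random
from pathlib import Path


def split_and_export(transcriptions, summaries, out_dir, test_fraction=0.2, seed=42):
    """Pair each transcription with its summary, shuffle, split, and write JSON files."""
    if len(transcriptions) != len(summaries):
        raise ValueError("each transcription needs exactly one summary")

    # Field names are illustrative assumptions, not taken from the original project.
    records = [
        {"transcription": text, "summary": summary}
        for text, summary in zip(transcriptions, summaries)
    ]

    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(records)

    # Hold out a fraction of the pairs for testing (at least one record).
    n_test = max(1, round(len(records) * test_fraction))
    test, train = records[:n_test], records[n_test:]

    # Write one JSON file per split (file names are assumed).
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, rows in (("train", train), ("test", test)):
        with open(out_dir / f"{name}.json", "w", encoding="utf-8") as fh:
            json.dump(rows, fh, ensure_ascii=False, indent=2)
    return train, test
```

Keeping each transcription and its summary in the same record means a shuffle or split can never separate a pair.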
|
62 |
|
63 |
### Training hyperparameters
|
64 |
|