---
library_name: transformers
license: apache-2.0
---
# Model Card for MedSymptomGPT
## Model Details

### Model Description
This is the model card for MedSymptomGPT, a model trained to generate medical symptoms from disease names. It uses the distilgpt2 architecture and has been fine-tuned on a dataset of diseases paired with their corresponding symptoms. The model is intended to generate symptom lists for given diseases, to aid medical research and education.
- **Developed by:** RohitPatill
- **Shared by:** Rohit Patil
- **Model type:** GPT-2 (causal language model)
- **Language(s) (NLP):** English
- **Finetuned from model:** distilbert/distilgpt2
### Model Sources

- **Repository:** MedSymptomGPT Repository
## Uses

### Direct Use
This model can be directly used to generate symptoms associated with a given disease. This can be particularly useful in medical research, education, and healthcare applications where quick access to symptom information is valuable.
### Downstream Use
This model can be fine-tuned further for more specific tasks related to medical text generation, such as creating detailed disease descriptions, patient information leaflets, or other medical documentation.
### Out-of-Scope Use
This model should not be used for making clinical decisions or providing medical advice. It is intended for educational and research purposes only and should not replace professional medical judgment.
## Bias, Risks, and Limitations

### Recommendations
Users (both direct and downstream) should be aware of the potential biases in the training data, which may lead to biased outputs. It is important to validate the generated content with authoritative medical sources.
## How to Get Started with the Model
Use the code below to get started with the model:
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the fine-tuned model and tokenizer from the Hugging Face Hub.
tokenizer = GPT2Tokenizer.from_pretrained('RohitPatill/MedSymptomGPT')
model = GPT2LMHeadModel.from_pretrained('RohitPatill/MedSymptomGPT')

# Prompt with a disease name and generate a symptom list.
input_str = "Kidney Failure"
input_ids = tokenizer.encode(input_str, return_tensors='pt')
output = model.generate(input_ids, max_length=50, num_return_sequences=1,
                        pad_token_id=tokenizer.eos_token_id)  # silences the open-end generation warning
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
## Training Details

### Training Data
The model was trained on a dataset containing disease names and their corresponding symptoms: QuyenAnhDE/Diseases_Symptoms, which was preprocessed to format the data appropriately for training a language model.
### Training Procedure

#### Preprocessing
- The symptoms for each disease were extracted and joined into a single string.
- Each disease name was then combined with its symptom string into a single training example.
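The card does not give the exact string format used, so the following is only a minimal sketch of the preprocessing steps above; the `"{disease}: {symptoms}"` layout, the field separator, and the example data are assumptions:

```python
# Sketch of the preprocessing described above: join each disease's symptoms
# into one string, then combine with the disease name into a training example.
# The exact format is an assumption, not the card's actual recipe.

def build_training_example(disease, symptoms):
    """Combine a disease name and its symptom list into a single string."""
    symptom_str = ", ".join(symptoms)  # symptoms joined into a single string
    return f"{disease}: {symptom_str}"

# Hypothetical rows in the style of the Diseases_Symptoms dataset.
examples = [
    ("Kidney Failure", ["fatigue", "swelling in legs", "shortness of breath"]),
    ("Migraine", ["headache", "nausea", "sensitivity to light"]),
]

corpus = [build_training_example(d, s) for d, s in examples]
```

Each resulting string is then tokenized as one sequence for causal language-model training.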
#### Training Hyperparameters

- **Training regime:** fp32 precision
- **Batch size:** 8
- **Learning rate:** 5e-4
- **Number of epochs:** 8
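The card does not say which training loop was used. As one illustration, the hyperparameters above could be expressed with the Hugging Face `TrainingArguments` API like this (the output directory name is an assumption):

```python
from transformers import TrainingArguments

# Illustrative mapping of the card's hyperparameters onto TrainingArguments.
# This is a sketch, not the original training setup.
args = TrainingArguments(
    output_dir="medsymptomgpt-finetune",  # assumed output path
    per_device_train_batch_size=8,        # Batch size: 8
    learning_rate=5e-4,                   # Learning rate: 5e-4
    num_train_epochs=8,                   # Number of epochs: 8
    fp16=False,                           # fp32 precision (no mixed precision)
)
```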
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data
The model was evaluated on a validation set split from the original training dataset; this set was used to monitor the model's performance during training.
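The split ratio is not stated in the card. As a plain-Python illustration of holding out a validation set (the 90/10 ratio and fixed seed below are assumptions):

```python
import random

# Illustrative train/validation split of the training corpus.
# The actual ratio used for MedSymptomGPT is not documented.
def train_val_split(examples, val_fraction=0.1, seed=42):
    rng = random.Random(seed)
    shuffled = examples[:]          # copy so the input list is untouched
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical corpus of 100 formatted disease/symptom strings.
data = [f"disease_{i}: symptom list {i}" for i in range(100)]
train_set, val_set = train_val_split(data)
```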
#### Factors
The evaluation considered the ability of the model to generate coherent and relevant symptoms for a given disease name.
Metrics
The primary metric used for evaluation was the loss function (CrossEntropyLoss
), which measures the difference between the predicted and actual symptoms.
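As a toy illustration of what cross-entropy measures (the 4-token vocabulary and probabilities below are made up for illustration, not drawn from the model):

```python
import math

# Cross-entropy loss for one position: the negative log-probability
# the model assigns to the token that actually appears next.
def cross_entropy(predicted_probs, target_index):
    return -math.log(predicted_probs[target_index])

# Toy distribution over a 4-token vocabulary; suppose token 2 is correct.
probs = [0.1, 0.2, 0.6, 0.1]
loss_confident = cross_entropy(probs, 2)  # low loss: correct token got 0.6
loss_wrong = cross_entropy(probs, 0)      # higher loss: this token got only 0.1
```

Training minimizes the average of this quantity over all token positions in the symptom strings.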
### Summary
The model was successfully trained and validated, showing its capability to generate relevant symptoms for given diseases. However, the exact loss values and other detailed metrics were not specified.