Model Card for MedSymptomGPT

Model Details

Model Description

This is the model card for MedSymptomGPT, a model fine-tuned to generate medical symptoms for a given disease name. It uses the distilgpt2 architecture and has been fine-tuned on a dataset of diseases paired with their corresponding symptoms. The model is intended to assist in generating symptom lists for given diseases, for use in medical research and education.

  • Developed by: RohitPatill
  • Shared by: Rohit Patil
  • Model type: GPT-2
  • Language(s) (NLP): English
  • Finetuned from model: distilbert/distilgpt2

Uses

Direct Use

This model can be directly used to generate symptoms associated with a given disease. This can be particularly useful in medical research, education, and healthcare applications where quick access to symptom information is valuable.

Downstream Use

This model can be fine-tuned further for more specific tasks related to medical text generation, such as creating detailed disease descriptions, patient information leaflets, or other medical documentation.

Out-of-Scope Use

This model should not be used for making clinical decisions or providing medical advice. It is intended for educational and research purposes only and should not replace professional medical judgment.

Bias, Risks, and Limitations

This model is trained on a dataset that may contain biases, and its outputs may reflect them. It is intended for educational and research purposes only and should not be used for clinical decision-making.

Recommendations

Users (both direct and downstream) should be aware of potential biases in the training data, which may lead to biased outputs, and should validate generated content against authoritative medical sources.

How to Get Started with the Model

Use the code below to get started with the model:

from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = GPT2Tokenizer.from_pretrained('RohitPatill/MedSymptomGPT')
model = GPT2LMHeadModel.from_pretrained('RohitPatill/MedSymptomGPT')

# Prompt the model with a disease name
input_str = "Kidney Failure"
input_ids = tokenizer.encode(input_str, return_tensors='pt')

# Generate a continuation (the associated symptoms)
output = model.generate(
    input_ids,
    max_length=50,
    num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
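
With greedy decoding and max_length=50 as above, the output typically echoes the prompt (the disease name) followed by generated symptom text. Passing sampling arguments such as do_sample=True or top_k to model.generate yields more varied outputs.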

Training Details

Training Data

The model was trained on a dataset containing disease names and their corresponding symptoms. The dataset used for training was QuyenAnhDE/Diseases_Symptoms, which was preprocessed to format the data appropriately for training a language model.

Training Procedure

Preprocessing

  • The symptoms for each disease were extracted and joined into a single string.
  • Each disease name was then combined with its symptom string into a single training example (see the sketch below).
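
A minimal sketch of this preprocessing, assuming the dataset exposes a Name column and a Symptoms column and that the disease name and symptoms were joined with simple separators; the exact schema and formatting used for training are not documented in this card:

from datasets import load_dataset

# Load the source dataset from the Hugging Face Hub
dataset = load_dataset('QuyenAnhDE/Diseases_Symptoms', split='train')

def combine(example):
    # 'Name' and 'Symptoms' are assumed column names; the actual schema may differ
    symptoms = example['Symptoms']
    if isinstance(symptoms, list):
        # Join list-valued symptoms into a single comma-separated string
        symptoms = ', '.join(symptoms)
    # Combine disease name and symptom string into one training example
    example['text'] = f"{example['Name']} {symptoms}"
    return example

dataset = dataset.map(combine)
print(dataset[0]['text'])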

Training Hyperparameters

  • Training regime: fp32 precision (see the fine-tuning sketch after this list)
  • Batch size: 8
  • Learning rate: 5e-4
  • Number of epochs: 8
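
The original training script is not included in this card. The following is an illustrative sketch of a fine-tuning setup consistent with the hyperparameters above, using the transformers Trainer and the preprocessed dataset from the Preprocessing sketch; the output directory, max_length, and data collator choice are assumptions.

from transformers import (GPT2Tokenizer, GPT2LMHeadModel,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained('distilgpt2')
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = GPT2LMHeadModel.from_pretrained('distilgpt2')

def tokenize(example):
    # max_length=128 is an assumption; the value used in training is not documented
    return tokenizer(example['text'], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, remove_columns=dataset.column_names)

training_args = TrainingArguments(
    output_dir='medsymptomgpt',       # hypothetical output directory
    per_device_train_batch_size=8,    # batch size: 8
    learning_rate=5e-4,               # learning rate: 5e-4
    num_train_epochs=8,               # number of epochs: 8
    fp16=False,                       # fp32 training regime
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()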

Evaluation

Testing Data, Factors & Metrics

Testing Data

The model was evaluated on a validation set split from the original training dataset; this set was used to monitor model performance during training.

Factors

The evaluation considered the ability of the model to generate coherent and relevant symptoms for a given disease name.

Metrics

The primary metric used for evaluation was the validation loss (CrossEntropyLoss), which measures how well the model predicts the next token in the disease-symptom text.
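
For a causal language model, this loss is computed over next-token predictions: the labels are the input tokens shifted by one position. A minimal sketch of scoring a single held-out example with the tokenizer and model loaded earlier (the symptom text below is purely illustrative, not taken from the actual dataset):

import torch

model.eval()
# Illustrative validation example
text = "Kidney Failure Decreased urine output, swelling, fatigue"
inputs = tokenizer(text, return_tensors='pt')

with torch.no_grad():
    # Passing labels=input_ids makes the model compute CrossEntropyLoss
    # over shifted next-token predictions internally
    outputs = model(**inputs, labels=inputs['input_ids'])

print(f"Loss: {outputs.loss.item():.4f}")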

Summary

The model was successfully trained and validated, showing its capability to generate relevant symptoms for given diseases. However, exact loss values and other detailed metrics are not reported in this card.

Technical Specifications

  • Model size: 81.9M parameters
  • Tensor type: F32
  • Format: Safetensors