---
library_name: transformers
license: apache-2.0
---

# Model Card for MedSymptomGPT

## Model Details

### Model Description

This is the model card for MedSymptomGPT, a model trained to generate medical symptoms from disease names. It uses the `distilgpt2` architecture and has been fine-tuned on a dataset of diseases and their corresponding symptoms. The model is intended to generate symptom lists for given diseases, in support of medical research and education.

- **Developed by:** RohitPatill
- **Shared by:** Rohit Patil
- **Model type:** GPT-2
- **Language(s) (NLP):** English
- **Finetuned from model:** `distilbert/distilgpt2`

### Model Sources

- **Repository:** [MedSymptomGPT Repository](https://huggingface.co/RohitPatill/MedSymptomGPT)

## Uses

### Direct Use

This model can be used directly to generate the symptoms associated with a given disease. This is particularly useful in medical research, education, and healthcare applications where quick access to symptom information is valuable.

### Downstream Use

This model can be fine-tuned further for more specific medical text generation tasks, such as creating detailed disease descriptions, patient information leaflets, or other medical documentation.

### Out-of-Scope Use

This model should not be used for making clinical decisions or providing medical advice. It is intended for educational and research purposes only and should not replace professional medical judgment.

## Bias, Risks, and Limitations

### Recommendations

Users (both direct and downstream) should be aware of potential biases in the training data, which may lead to biased outputs. It is important to validate the generated content against authoritative medical sources.
## How to Get Started with the Model

Use the code below to get started with the model:

```python
from transformers import GPT2Tokenizer, GPT2LMHeadModel

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = GPT2Tokenizer.from_pretrained('RohitPatill/MedSymptomGPT')
model = GPT2LMHeadModel.from_pretrained('RohitPatill/MedSymptomGPT')
# Tokenize the disease name; this also returns the attention mask.
input_str = "Kidney Failure"
inputs = tokenizer(input_str, return_tensors='pt')
# GPT-2 tokenizers define no pad token by default, so reuse EOS for padding.
output = model.generate(
    **inputs, max_length=50, num_return_sequences=1,
    pad_token_id=tokenizer.eos_token_id,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```

## Training Details

### Training Data

The model was trained on the `QuyenAnhDE/Diseases_Symptoms` dataset, which pairs disease names with their corresponding symptoms. The data was preprocessed into a format suitable for training a language model.

### Training Procedure

#### Preprocessing

- The dataset was preprocessed to combine disease names and symptoms into a single string format.
- The symptoms were extracted and joined into a single string for each disease.

#### Training Hyperparameters

- **Training regime:** The model was trained using `fp32` precision.
- **Batch size:** 8
- **Learning rate:** 5e-4
- **Number of epochs:** 8

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The model was evaluated on a validation set split from the original training dataset. The validation set was used to monitor model performance during training.

#### Factors

The evaluation considered the ability of the model to generate coherent and relevant symptoms for a given disease name.

#### Metrics

The primary evaluation metric was the training loss function (`CrossEntropyLoss`), which measures how far the model's predicted next-token distribution is from the actual tokens of the reference symptom text.

### Summary

The model was successfully trained and validated, showing its capability to generate relevant symptoms for given diseases. However, exact loss values and other detailed metrics were not specified.
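Since no loss values are reported, the following is only a minimal sketch of how the cross-entropy metric named under Metrics, and the perplexity often derived from it, can be computed from per-token probabilities. The function names and the probabilities are hypothetical and chosen for easy arithmetic; they are not measured results for MedSymptomGPT. PyTorch's `CrossEntropyLoss` takes logits rather than probabilities, but with its default mean reduction it yields the same mean negative log-likelihood.

```python
import math


def cross_entropy(target_probs):
    """Mean negative log-likelihood (in nats) of the correct next tokens."""
    if not target_probs:
        raise ValueError("need at least one token probability")
    return -sum(math.log(p) for p in target_probs) / len(target_probs)


def perplexity(loss):
    """Perplexity is the exponential of the mean cross-entropy loss."""
    return math.exp(loss)


# Hypothetical probabilities the model assigned to each correct next token
# in one validation example (illustrative values, not measured).
probs = [0.5, 0.25, 0.125]
loss = cross_entropy(probs)
print(f"loss={loss:.4f}  perplexity={perplexity(loss):.4f}")
# prints: loss=1.3863  perplexity=4.0000
```

A lower loss means the model assigns higher probability to the reference symptom tokens. A perplexity of 4 can be read as the model being, on average, as uncertain as if it were choosing uniformly among four tokens at each step.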
## Bias, Risks, and Limitations

This model is trained on a dataset that may contain biases, and the outputs should be validated against authoritative medical sources. The model is intended for educational and research purposes only and should not be used for clinical decision-making.