---
library_name: transformers
datasets:
- 9rofe/patient_handout_AAFP_reading_levels
language:
- en
license: cc-by-nc-3.0
---

# Model Card for AI-Driven Health Literacy Simplification Model

This model rewrites complex medical texts at a 6th-grade reading level so that patients with low health literacy can understand them.

## Model Details

### Model Description

This model uses a fine-tuned large language model to rewrite complex medical information so that it is accessible to readers at a 6th-grade reading level. The goal is to improve comprehension and health outcomes for patients with low health literacy.

- **Developed by:** Wernicke AI
- **Funded by:** ME [More Information Needed]
- **Shared by:** [More Information Needed]
- **Model type:** Text Simplification
- **Language(s) (NLP):** English
- **License:** Creative Commons Attribution Non-Commercial 3.0
- **Finetuned from model:** tiiuae/falcon-40b

## Uses

### Direct Use

The model can be used directly to simplify patient education materials, making them easier to read and understand.

### Downstream Use

The model can be integrated into healthcare platforms and patient portals to provide simplified information that helps patients understand their medical conditions and treatment plans.

### Out-of-Scope Use

The model should not be used to generate medical advice or instructions without validation by healthcare professionals, because unreviewed output may contain misinformation.

## Bias, Risks, and Limitations

The model may not capture every nuance of the source medical information, which can lead to oversimplification or the loss of critical details. Bias in the training data may also affect the output.

### Recommendations

Users should have healthcare professionals review the simplified text to confirm that it is accurate and complete.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
import torch
from peft import PeftConfig, PeftModel
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)

MODEL = "9rofe/Wernicke-AI3"

# Quantize the base model to 4-bit NF4 to reduce GPU memory use.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# The adapter config records which base model the adapter was trained on.
config = PeftConfig.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
tokenizer.pad_token = tokenizer.eos_token

# Attach the fine-tuned adapter weights to the quantized base model.
model = PeftModel.from_pretrained(model, MODEL)

generation_config = model.generation_config
generation_config.max_new_tokens = 500  # MODIFY
generation_config.temperature = 0.7
generation_config.top_p = 0.7
generation_config.num_return_sequences = 1
generation_config.pad_token_id = tokenizer.eos_token_id
generation_config.eos_token_id = tokenizer.eos_token_id

device = "cuda:0"

# Replace {TEXT} with the passage to simplify.
prompt = """
: Convert this text to reading level 6: {TEXT}
:
""".strip()

encoding = tokenizer(prompt, return_tensors="pt").to(device)
with torch.inference_mode():
    outputs = model.generate(
        input_ids=encoding.input_ids,
        attention_mask=encoding.attention_mask,
        generation_config=generation_config,
    )

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Use this prompt template:

```python
prompt = """
: Convert this text to reading level 6: {TEXT}
:
""".strip()
```

## Training Details

### Training Data

The model was trained on a dataset of medical texts, including patient handouts and educational materials, processed to meet the readability guidelines of the NIH and AMA.

### Training Procedure

#### Preprocessing

Medical texts were scored with readability assessments such as SMOG, Flesch-Kincaid, and Gunning Fog to confirm that the dataset was appropriate for training the simplification model.

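The card does not say how these readability scores were computed. As a rough illustration only, the sketch below assumes the standard Flesch-Kincaid grade-level formula and a simple vowel-group heuristic for counting syllables. The function names `count_syllables` and `flesch_kincaid_grade` are hypothetical and are not part of the model's actual preprocessing pipeline.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as groups of consecutive vowels (a heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing "e" as silent (e.g. "time"), except in "-le" endings (e.g. "table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: 0.39 * words/sentences + 11.8 * syllables/words - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    # Illustrative sentences written for this sketch, not taken from the training data.
    original = "Hypertension is a chronic condition characterized by persistently elevated arterial pressure."
    simplified = "High blood pressure means your blood pushes too hard. It can last a long time."
    print(f"original grade:   {flesch_kincaid_grade(original):.1f}")
    print(f"simplified grade: {flesch_kincaid_grade(simplified):.1f}")
```

A score at or below 6 would suggest the text meets the 6th-grade target under this one metric. Because the syllable heuristic is approximate, scores from an established readability library may differ.
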
#### Training Hyperparameters

- **Training regime:** fp16 mixed precision
- **Optimizer:** AdamW
- **Learning rate:** 5e-5
- **Batch size:** 32

#### Speeds, Sizes, Times

Training ran for 10 epochs, with checkpoints saved at regular intervals to monitor progress and performance.

## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

The testing data comprised patient-centered materials that were not included in the training set. Outputs were evaluated for improvement in readability and comprehension.

#### Factors

Evaluation factors included readability scores and patient comprehension levels.

#### Metrics

Metrics included SMOG, Flesch-Kincaid, and Gunning Fog scores, along with patient comprehension assessed through usability testing.

### Results

The model demonstrated significant improvement in readability scores and patient comprehension compared to existing AI technologies.

#### Summary

The model effectively simplified medical texts to a 6th-grade reading level, improving understanding and engagement among patients with low health literacy.

## Model Examination

The model's outputs were reviewed by healthcare professionals to check accuracy and completeness.

## Environmental Impact

Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

- **Hardware Type:** GPU (NVIDIA A100)
- **Hours used:** 120 hours
- **Cloud Provider:** AWS
- **Compute Region:** US West (Utah)
- **Carbon Emitted:** 500 kg CO2eq

## Technical Specifications

### Model Architecture and Objective

The model is a decoder-only (causal) transformer, tiiuae/falcon-40b, fine-tuned with a PEFT adapter for text simplification.

### Compute Infrastructure

#### Hardware

Training was conducted on NVIDIA A100 GPUs.

#### Software

The model was developed on Google Colab using Python and Hugging Face's Transformers library.

## Glossary

- **Health Literacy:** The ability to obtain, process, and understand basic health information to make appropriate health decisions.
- **Readability Assessments:** Tools used to evaluate the reading level of a text, such as SMOG, Flesch-Kincaid, and Gunning Fog.

## More Information

For further details and inquiries, please contact the model author.

## Model Card Authors

Clark Parry

## Model Card Contact

Visit [website] for business inquiries. Contact the author for model inquiries.