ICD-10 Code Predictor

A fine-tuned language model that predicts ICD-10 diagnosis codes from clinical text descriptions.

Model Details

Model Description

This model takes a plain English description of a patient's symptoms or condition and outputs the corresponding ICD-10 diagnosis code. It is built on Meta's Llama 3.2 3B base model, fine-tuned using LoRA (Low-Rank Adaptation) with the Unsloth library for efficient training.

  • Developed by: StudioIlios
  • Model type: Causal Language Model (LoRA fine-tuned)
  • Language(s): English (clinical/medical text)
  • Base Model: meta-llama/Llama-3.2-3B
  • Fine-tuning method: LoRA via Unsloth
  • License: [More Information Needed]

Uses

Direct Use

Input a clinical description of a patient's condition and the model will return the predicted ICD-10 code.

Example prompt:

Patient has diabetes mellitus with high blood sugar. What is the ICD10 code?

Example output:

The ICD10 code for Diabetes mellitus is E11.9
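
Because the model answers in a full sentence, downstream code usually has to pull the code out of the text. The following is a minimal sketch, assuming the answer contains a string shaped like an ICD-10-CM code (a letter, a digit, one alphanumeric character, then an optional dot and up to four more characters). The helper name `extract_icd10_code` is hypothetical and not part of this model's repository. It checks format only, not whether the code exists in the ICD-10 code set.

```python
import re

# Format-only pattern for an ICD-10-CM style code, e.g. "E11.9" or "I10".
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")


def extract_icd10_code(text):
    """Return the first ICD-10-shaped token in `text`, or None if there is none."""
    match = ICD10_PATTERN.search(text)
    return match.group(0) if match else None


print(extract_icd10_code("The ICD10 code for Diabetes mellitus is E11.9"))  # E11.9
```

A `None` result is a useful signal that the generation did not contain a code at all and should be retried or sent for manual coding.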

Downstream Use

  • Medical billing automation
  • Insurance claim processing
  • EHR (Electronic Health Record) systems
  • Healthcare apps requiring automatic diagnosis code suggestion

Out-of-Scope Use

  • This model should not be used as a substitute for professional medical diagnosis
  • Not suitable for rare or highly complex conditions without human verification
  • Not intended for real-time critical care decisions

Bias, Risks, and Limitations

  • Model predictions should always be verified by a qualified medical coder or physician
  • May not accurately predict codes for uncommon or highly specific conditions
  • Performance depends on how clearly the condition is described in the input

Recommendations

Always have a medical professional review the predicted ICD-10 codes before using them for billing or insurance purposes.

How to Get Started with the Model

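from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model first, then attach the LoRA adapter weights on top of it.
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")
tokenizer = AutoTokenizer.from_pretrained("StudioIlios/icd10-model")
model = PeftModel.from_pretrained(base_model, "StudioIlios/icd10-model")

prompt = """
Patient has diabetes mellitus with high blood sugar.
What is the ICD10 code?
"""

# Tokenize the prompt and place the tensors on the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# With do_sample=True the answer can vary between runs.
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, do_sample=True)
# The decoded text contains the prompt followed by the generated answer.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))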
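
For a causal language model, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string above repeats the question. A small sketch of one way to keep only the answer is to slice off the prompt by token count before decoding. The helper name `new_token_ids` is hypothetical; it works on any sliceable sequence, including a plain list or a 1-D tensor.

```python
def new_token_ids(output_ids, prompt_length):
    """Return only the token ids generated after the prompt."""
    if prompt_length < 0 or prompt_length > len(output_ids):
        raise ValueError("prompt_length must be between 0 and len(output_ids)")
    return output_ids[prompt_length:]


# Usage with the snippet above (assumes `inputs`, `outputs`, `tokenizer` exist):
#   prompt_length = inputs["input_ids"].shape[1]
#   answer_ids = new_token_ids(outputs[0], prompt_length)
#   print(tokenizer.decode(answer_ids, skip_special_tokens=True))
```

Slicing by token count tends to be more reliable than removing the prompt string from the decoded text, since decoding does not always reproduce the prompt's whitespace exactly.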

Training Details

Training Data

Fine-tuned on clinical text descriptions paired with their ICD-10 diagnosis codes.

Training Procedure

Training Hyperparameters

  • Training regime: LoRA fine-tuning with bf16 mixed precision
  • Library: Unsloth
  • Base model: Llama 3.2 3B

Evaluation

Results

The model is reported to predict common ICD-10 codes correctly from plain-English clinical descriptions. Only a single tested sample is documented here, and no aggregate metrics are given.

Sample tested:

Input                                     | Predicted Code
Diabetes mellitus with high blood sugar   | E11.9
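
A single sample does not give an accuracy figure. As a sketch of how a larger labeled test set could be scored, the function below computes exact-match accuracy between predicted and reference codes. The function names are hypothetical, and the codes in the usage example are made-up illustrations, not outputs of this model.

```python
def normalize_code(code):
    """Trim whitespace and uppercase so 'e11.9 ' and 'E11.9' compare equal."""
    return code.strip().upper()


def exact_match_accuracy(predicted, expected):
    """Fraction of positions where the predicted code equals the reference code."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        return 0.0
    hits = sum(
        normalize_code(p) == normalize_code(e) for p, e in zip(predicted, expected)
    )
    return hits / len(expected)


# Illustrative values only: two of the three predictions match their reference.
predictions = ["E11.9", "i10", "J45.909"]
references = ["E11.9", "I10", "J18.9"]
print(exact_match_accuracy(predictions, references))
```

Exact match is strict: a prediction in the right category but with the wrong suffix counts as a miss, which is usually the appropriate standard for billing codes.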

Technical Specifications

Model Architecture

  • Base: Llama 3.2 3B (causal language model)
  • Adapter: LoRA (Low-Rank Adaptation)
  • Files: adapter_config.json, adapter_model.safetensors
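
LoRA keeps the base weights frozen and learns a low-rank update: for a weight matrix of shape d_out x d_in, it trains two small matrices of shapes d_out x r and r x d_in. This is why the adapter files are small compared with the base model. The sketch below counts the parameters involved. The rank and matrix dimensions are illustrative assumptions and are not read from this model's adapter_config.json.

```python
def lora_param_count(d_out, d_in, rank):
    """Trainable parameters LoRA adds for one d_out x d_in weight matrix.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


def full_param_count(d_out, d_in):
    """Parameters in the frozen weight matrix itself."""
    return d_out * d_in


# Assumed example: a 3072 x 3072 projection with rank 16 (illustrative values).
added = lora_param_count(3072, 3072, 16)
frozen = full_param_count(3072, 3072)
print(added)   # 98304
print(frozen)  # 9437184
print(f"{added / frozen:.2%} of the matrix's parameters are trained")
```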

Hardware Used for Training

  • GPU: NVIDIA Tesla T4 (Google Colab)