GPT-2-LoRA-HealthCare

Original Authors: BastienHot, ZhanPascal
Date created: 2024/02/25
Dataset:

Spaces Demo

Description: GPT-2-LoRA-HealthCare

The GPT-2-LoRA-HealthCare model was developed as a student project during the Bachelor of Technology (BUT) in Computer Science at IUT Villetaneuse. It is based on the pre-trained keras_nlp model (gpt2-large-en) and is fine-tuned with the LoRA technique, which reduces the number of weights that have to be trained. The model is designed for question answering in a healthcare context: a patient asks a question and the model generates an answer.

LoRA (Low-Rank Adaptation) is a fine-tuning technique for large language models (LLMs) that cuts training time and memory use by training only a small set of additional weights instead of the full model. It works as follows:

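  • Select the pretrained weight matrices to adapt (typically the attention projections)
  • Rely on the observation that the change a matrix needs during fine-tuning has low rank, meaning most of its columns are linearly dependent on a few others
  • Create a LoRA layer from two small matrices whose product has that low rank, so only the linearly independent directions are stored
  • Freeze the pretrained model's weights
  • Train only the LoRA layer weights
  • Merge the LoRA layer into the pretrained weights by adding the product of the two small matrices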
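
The steps above can be sketched in a few lines of NumPy. This is a toy illustration of the LoRA idea, not the training code used for this model: the matrix sizes, the scaling factor, the learning rate, and the synthetic regression task are all assumptions chosen to keep the example small, and the names `W`, `A`, `B` and `forward` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; a real GPT-2 projection matrix is far larger.
d_in, d_out, rank = 8, 6, 2
scale = 1.0  # LoRA scaling factor (alpha / rank); 1.0 is an arbitrary choice

# Stand-in for one pretrained weight matrix. It is never updated (frozen).
W = rng.normal(size=(d_in, d_out))
W_frozen_copy = W.copy()

# LoRA layer: A starts random and B starts at zero, so A @ B is zero at first
# and the adapted layer initially behaves exactly like the pretrained one.
A = rng.normal(scale=1.0 / np.sqrt(d_in), size=(d_in, rank))
B = np.zeros((rank, d_out))

def forward(x):
    """Frozen pretrained path plus the low-rank LoRA path."""
    return x @ W + scale * (x @ A) @ B

# Toy fine-tuning task: the target differs from W by a low-rank shift.
x = rng.normal(size=(32, d_in))
shift = 0.3 * rng.normal(size=(d_in, rank)) @ (0.3 * rng.normal(size=(rank, d_out)))
y = x @ (W + shift)

def mse():
    return float(np.mean((forward(x) - y) ** 2))

loss_start = mse()
lr = 0.02
for _ in range(500):
    grad_out = 2.0 * (forward(x) - y) / y.size  # dLoss/dOutput
    grad_A = scale * x.T @ (grad_out @ B.T)
    grad_B = scale * (x @ A).T @ grad_out
    A -= lr * grad_A  # only the LoRA matrices are updated
    B -= lr * grad_B
loss_end = mse()

# Merge: fold the low-rank update into the pretrained weights.
W_merged = W + scale * A @ B

print("trainable LoRA weights:", A.size + B.size, "vs full matrix:", W.size)
print("loss before / after:", loss_start, loss_end)
```

After merging, `x @ W_merged` gives the same output as `forward(x)`, so the LoRA layer adds no extra cost at inference time. The saving comes from training `rank * (d_in + d_out)` weights instead of `d_in * d_out`, which grows more significant as the matrices get larger.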