---
license: mit
datasets:
- keivalya/MedQuad-MedicalQnADataset
language:
- en
library_name: peft
tags:
- medical
---
# Model Card for GaiaMiniMed

This is a medical fine-tuned model built on the Falcon-7b-Instruct base, trained for 500 steps over 6 epochs on the keivalya/MedQuad-MedicalQnADataset.

## Model Details

### Model Description

- Developed by: Tonic
- Shared by: Tonic
- Model type: Medical Fine-Tuned Conversational Falcon 7b (Instruct)
- Language(s) (NLP): English
- License: MIT
- Finetuned from model: tiiuae/falcon-7b-instruct
### Model Sources

- Repository: Github
- Demo: [More Information Needed]
## Uses

Use this model as you would use Falcon Instruct models.

### Direct Use

This model is intended for educational purposes only; always consult a doctor for medical advice.

This model should perform better at medical QnA tasks in a conversational manner. It is our hope that it will help improve patient outcomes and public health.

### Downstream Use

Use this model alongside others in group conversations to produce diagnoses, public health advisories, and personal hygiene improvements.

### Out-of-Scope Use

This model is not meant as a decision support system in the wild; it is for educational use only.
## Bias, Risks, and Limitations

[More Information Needed]
## How to Get Started with the Model

Use the code below to get started with the model.

[More Information Needed]
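As a starting point, a minimal loading sketch with PEFT is shown below. The adapter repo id `Tonic/GaiaMiniMed` and the simple question/answer prompt format are assumptions not confirmed by this card; only the base model id `tiiuae/falcon-7b-instruct` comes from the card itself.

```python
# Sketch: load this card's LoRA adapter on top of Falcon-7b-Instruct with PEFT.
# The adapter repo id "Tonic/GaiaMiniMed" and the prompt format are assumptions.

def format_prompt(question: str) -> str:
    """Wrap a medical question in a simple instruction-style prompt (assumed format)."""
    return f"Question: {question}\nAnswer:"

def generate_answer(question: str, adapter_id: str = "Tonic/GaiaMiniMed") -> str:
    """Load the base model plus the LoRA adapter and generate one answer."""
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "tiiuae/falcon-7b-instruct"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)

    inputs = tokenizer(format_prompt(question), return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.9)
    return tokenizer.decode(out[0], skip_special_tokens=True)
```

`generate_answer` is defined but not called here, since loading the 7B base model requires a GPU and a model download.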
## Training Details

### Results

```
TrainOutput(global_step=6150, training_loss=1.0597990553941183, metrics={'epoch': 6.0})
```
### Training Data

```
DatasetDict({
    train: Dataset({
        features: ['qtype', 'Question', 'Answer'],
        num_rows: 16407
    })
})
```
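The training split above can be loaded with the `datasets` library as sketched below. The column names and row count match the dump; the `qa_to_text` formatting is an illustrative assumption, not the confirmed training format.

```python
# Sketch: load the MedQuad QnA data used for fine-tuning.
# qa_to_text is an assumed formatting helper for illustration only.

def qa_to_text(example: dict) -> dict:
    """Join a row's question and answer into a single training string (assumed format)."""
    return {"text": f"Question: {example['Question']}\nAnswer: {example['Answer']}"}

def load_medquad():
    from datasets import load_dataset

    ds = load_dataset("keivalya/MedQuad-MedicalQnADataset", split="train")
    # Expected shape per the dump above: ['qtype', 'Question', 'Answer'], 16407 rows
    return ds.map(qa_to_text)
```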
### Training Procedure

#### Preprocessing

```
trainable params: 4718592 || all params: 3613463424 || trainable%: 0.13058363808693696
```
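The trainable-parameter count is consistent with rank-16 LoRA adapters on the `query_key_value` projection of all 32 Falcon decoder layers (see the module dump under "Model Architecture and Objective"):

```python
# Verify the PEFT printout from the LoRA shapes in the architecture dump.
in_features, out_features = 4544, 4672  # query_key_value projection
r, n_layers = 16, 32                    # LoRA rank, decoder layers

per_layer = in_features * r + r * out_features  # lora_A + lora_B parameters
trainable = per_layer * n_layers
print(trainable)                                # 4718592, as printed above
print(round(100 * trainable / 3613463424, 4))   # 0.1306 trainable%
```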
#### Training Hyperparameters

- Training regime: [More Information Needed]
#### Speeds, Sizes, Times

```
metrics={'train_runtime': 30766.4612, 'train_samples_per_second': 3.2,
'train_steps_per_second': 0.2, 'total_flos': 1.1252790565109983e+18,
'train_loss': 1.0597990553941183}
```
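The reported throughput figures are internally consistent with the dataset size and step count: 16,407 rows over 6 epochs in 30,766 s of runtime, across 6,150 optimizer steps:

```python
# Cross-check the reported training metrics.
rows, epochs = 16407, 6
runtime_s, steps = 30766.4612, 6150

samples = rows * epochs                  # 98442 samples seen in total
print(round(samples / runtime_s, 1))     # 3.2  -> train_samples_per_second
print(round(steps / runtime_s, 1))       # 0.2  -> train_steps_per_second
print(round(samples / steps))            # 16   -> implied effective batch size
```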
## Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: [More Information Needed]
- Hours used: [More Information Needed]
- Cloud Provider: [More Information Needed]
- Compute Region: [More Information Needed]
- Carbon Emitted: [More Information Needed]
## Technical Specifications

### Model Architecture and Objective
```
PeftModelForCausalLM(
  (base_model): LoraModel(
    (model): FalconForCausalLM(
      (transformer): FalconModel(
        (word_embeddings): Embedding(65024, 4544)
        (h): ModuleList(
          (0-31): 32 x FalconDecoderLayer(
            (self_attention): FalconAttention(
              (maybe_rotary): FalconRotaryEmbedding()
              (query_key_value): Linear4bit(
                in_features=4544, out_features=4672, bias=False
                (lora_dropout): ModuleDict(
                  (default): Dropout(p=0.05, inplace=False)
                )
                (lora_A): ModuleDict(
                  (default): Linear(in_features=4544, out_features=16, bias=False)
                )
                (lora_B): ModuleDict(
                  (default): Linear(in_features=16, out_features=4672, bias=False)
                )
                (lora_embedding_A): ParameterDict()
                (lora_embedding_B): ParameterDict()
              )
              (dense): Linear4bit(in_features=4544, out_features=4544, bias=False)
              (attention_dropout): Dropout(p=0.0, inplace=False)
            )
            (mlp): FalconMLP(
              (dense_h_to_4h): Linear4bit(in_features=4544, out_features=18176, bias=False)
              (act): GELU(approximate='none')
              (dense_4h_to_h): Linear4bit(in_features=18176, out_features=4544, bias=False)
            )
            (input_layernorm): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
          )
        )
        (ln_f): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
      )
      (lm_head): Linear(in_features=4544, out_features=65024, bias=False)
    )
  )
)
```
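The adapter shapes in the dump above correspond to a PEFT `LoraConfig` along the lines of the following sketch. The rank (16), dropout (0.05), and target module (`query_key_value`) are read directly off the dump; `lora_alpha` is an assumption, since it is not recoverable from the printed module tree.

```python
from peft import LoraConfig

# Reconstructed from the module dump: rank-16 A/B matrices and p=0.05 dropout
# on query_key_value. lora_alpha=32 is an assumption, not shown in the dump.
config = LoraConfig(
    r=16,
    lora_alpha=32,  # assumption
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    bias="none",
    task_type="CAUSAL_LM",
)
```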
### Compute Infrastructure

Google Colaboratory

#### Hardware

A100