	---
	license: mit
	datasets:
	- keivalya/MedQuad-MedicalQnADataset
	language:
	- en
	library_name: peft
	tags:
	- medical
	---

	# Model Card for GaiaMiniMed

	GaiaMiniMed is a medical fine-tune of [Falcon-7b-Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct), trained for 500 steps and 6 epochs on the [MedAware](https://huggingface.co/datasets/keivalya/MedQuad-MedicalQnADataset) dataset (`keivalya/MedQuad-MedicalQnADataset`) from [keivalya](https://huggingface.co/datasets/keivalya).


	## Model Details

	### Model Description

	- Developed by: [Tonic](https://www.huggingface.co/tonic)
	- Shared by: [Tonic](https://www.huggingface.co/tonic)
	- Model type: Medical Fine-Tuned Conversational Falcon 7b (Instruct)
	- Language(s) (NLP): English
	- License: MIT
	- Finetuned from model: [tiiuae/falcon-7b-instruct](https://huggingface.co/tiiuae/falcon-7b-instruct)

	### Model Sources

	- Repository: [GitHub](https://github.com/Josephrp/AI-challenge-hackathon/blob/master/falcon_7b_instruct_GaiaMiniMed_dataset.ipynb)
	- Demo: [More Information Needed]

	## Uses

	Use this model as you would other Falcon Instruct models.

	### Direct Use

	This model is intended for educational purposes only; always consult a doctor for the best advice.

	This model should perform better than its Falcon-7b-Instruct base at medical Q&A tasks in a conversational setting.

	It is our hope that it will help improve patient outcomes and public health.

	### Downstream Use

	Use this model alongside others in group conversations to produce diagnoses, public health advisories, and personal hygiene improvements.

	### Out-of-Scope Use

	This model is not meant as a decision support system in the wild, only for educational use.

	## Bias, Risks, and Limitations

	[More Information Needed]

	## How to Get Started with the Model

	Use the code below to get started with the model.

	[More Information Needed]
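As a starting point, the sketch below shows one way the model might be loaded with `transformers` and `peft`. It rests on several assumptions that this card does not confirm: that the repository holds LoRA adapter weights under the hypothetical id `Tonic/GaiaMiniMed`, that the base model is loaded in 4-bit (as the `Linear4bit` layers under Technical Specifications suggest), and that a plain `Question:`/`Answer:` prompt is adequate. The helper names `build_prompt`, `load_model`, and `answer` are illustrative only.

```python
# Assumed identifiers: the adapter id is a guess based on the card title and author.
BASE_MODEL_ID = "tiiuae/falcon-7b-instruct"  # base model named in this card
ADAPTER_ID = "Tonic/GaiaMiniMed"             # hypothetical repository id


def build_prompt(question: str) -> str:
    """Wrap a question in a simple Q/A prompt (assumed format, not confirmed)."""
    return f"Question: {question.strip()}\nAnswer:"


def load_model():
    """Load the base model in 4-bit and attach the LoRA adapter."""
    # Heavy imports stay local so build_prompt works without them installed.
    import torch
    from peft import PeftModel
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        quantization_config=BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_compute_dtype=torch.bfloat16,
        ),
        device_map="auto",
    )
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def answer(question: str, tokenizer, model, max_new_tokens: int = 200) -> str:
    """Generate a reply and return only the newly generated text."""
    import torch

    inputs = tokenizer(
        build_prompt(question),
        return_tensors="pt",
        return_token_type_ids=False,  # Falcon's generate() does not use them
    ).to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Drop the prompt tokens so only the continuation is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `tokenizer, model = load_model()` and then `answer("What are the symptoms of anemia?", tokenizer, model)` would download the weights and requires a CUDA GPU with `bitsandbytes` installed. Nothing heavy runs at import time.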

	## Training Details

	### Results


	![image/png](https://cdn-uploads.huggingface.co/production/uploads/62a3bb1cd0d8c2c2169f0b88/F8GfMSJcAaH7pXvpUK_r3.png)

	```
	TrainOutput(global_step=6150, training_loss=1.0597990553941183,
	{'epoch': 6.0})
	```


	### Training Data


	```
	DatasetDict({
	    train: Dataset({
	        features: ['qtype', 'Question', 'Answer'],
	        num_rows: 16407
	    })
	})
	```


	### Training Procedure


	#### Preprocessing

	```
	trainable params: 4718592 || all params: 3613463424 || trainables%: 0.13058363808693696
	```

	#### Training Hyperparameters

	- Training regime: [More Information Needed]

	#### Speeds, Sizes, Times

	```
	metrics={'train_runtime': 30766.4612, 'train_samples_per_second': 3.2, 'train_steps_per_second': 0.2,
	'total_flos': 1.1252790565109983e+18, 'train_loss': 1.0597990553941183,
	```
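The reported figures can be cross-checked with a little arithmetic. The sketch below uses only numbers printed in this card (runtime, throughput, step count, dataset size, epochs). The samples-per-step value it derives is an inference from those numbers, not a documented hyperparameter.

```python
# Figures copied from this card.
train_runtime_s = 30766.4612   # 'train_runtime'
samples_per_second = 3.2       # 'train_samples_per_second'
global_step = 6150             # TrainOutput.global_step
num_rows = 16407               # rows in the train split
epochs = 6

hours = train_runtime_s / 3600
samples_seen = num_rows * epochs                       # one pass per epoch
samples_from_rate = samples_per_second * train_runtime_s
samples_per_step = samples_seen / global_step          # inferred, not documented
steps_per_epoch = global_step / epochs

print(f"runtime:            {hours:.2f} h")
print(f"samples (rows x epochs): {samples_seen}")
print(f"samples (rate x time):   {samples_from_rate:.0f}")
print(f"samples per step:   {samples_per_step:.2f}")
print(f"steps per epoch:    {steps_per_epoch:.0f}")
```

The two sample counts agree to within a fraction of a percent, and the ratio works out to roughly 16 samples per optimizer step over about 8.5 hours of training.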

	## Environmental Impact

	Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).

	- Hardware Type: A100
	- Hours used: approximately 8.5 (training only, from the reported `train_runtime` of 30766.4612 s)
	- Cloud Provider: Google (Colaboratory)
	- Compute Region: [More Information Needed]
	- Carbon Emitted: [More Information Needed]

	## Technical Specifications

	### Model Architecture and Objective

	```
	PeftModelForCausalLM(
	  (base_model): LoraModel(
	    (model): FalconForCausalLM(
	      (transformer): FalconModel(
	        (word_embeddings): Embedding(65024, 4544)
	        (h): ModuleList(
	          (0-31): 32 x FalconDecoderLayer(
	            (self_attention): FalconAttention(
	              (maybe_rotary): FalconRotaryEmbedding()
	              (query_key_value): Linear4bit(
	                in_features=4544, out_features=4672, bias=False
	                (lora_dropout): ModuleDict(
	                  (default): Dropout(p=0.05, inplace=False)
	                )
	                (lora_A): ModuleDict(
	                  (default): Linear(in_features=4544, out_features=16, bias=False)
	                )
	                (lora_B): ModuleDict(
	                  (default): Linear(in_features=16, out_features=4672, bias=False)
	                )
	                (lora_embedding_A): ParameterDict()
	                (lora_embedding_B): ParameterDict()
	              )
	              (dense): Linear4bit(in_features=4544, out_features=4544, bias=False)
	              (attention_dropout): Dropout(p=0.0, inplace=False)
	            )
	            (mlp): FalconMLP(
	              (dense_h_to_4h): Linear4bit(in_features=4544, out_features=18176, bias=False)
	              (act): GELU(approximate='none')
	              (dense_4h_to_h): Linear4bit(in_features=18176, out_features=4544, bias=False)
	            )
	            (input_layernorm): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
	          )
	        )
	        (ln_f): LayerNorm((4544,), eps=1e-05, elementwise_affine=True)
	      )
	      (lm_head): Linear(in_features=4544, out_features=65024, bias=False)
	    )
	  )
	)
	```
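The trainable parameter count reported under Training Procedure can be reproduced from the layer shapes in this printout. The sketch below assumes that the only trainable weights are the `lora_A` and `lora_B` matrices attached to `query_key_value` in each of the 32 decoder layers, which is what the printout shows.

```python
# Shapes copied from the architecture printout.
hidden = 4544       # in_features of query_key_value
qkv_out = 4672      # out_features of query_key_value
rank = 16           # LoRA rank (out_features of lora_A)
num_layers = 32     # number of FalconDecoderLayer blocks

lora_a = hidden * rank      # lora_A: 4544 x 16
lora_b = rank * qkv_out     # lora_B: 16 x 4672
per_layer = lora_a + lora_b
trainable = per_layer * num_layers

# Values reported in the training log.
reported_trainable = 4718592
reported_all = 3613463424

pct = 100 * trainable / reported_all
print(f"LoRA params per layer: {per_layer}")
print(f"trainable params:      {trainable}")
print(f"trainable share:       {pct:.4f}%")
```

The result matches the reported 4718592 trainable parameters and the share of about 0.13%. The total of roughly 3.6 billion is about half of Falcon-7B's nominal size; a likely explanation is that 4-bit weights are stored packed and counted in packed form, though this card does not state that.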

	### Compute Infrastructure

	Google Colaboratory

	#### Hardware

	A100


	## Model Card Authors

	[Tonic](https://huggingface.co/tonic)

	## Model Card Contact

	[Tonic](https://huggingface.co/tonic)