---
library_name: peft
base_model: mistralai/Mistral-7B-v0.1
datasets:
- gsm8k
---

# Model Card for mistral-7b-gsmk8k-lora-r8

Trained with [Ludwig.ai](https://ludwig.ai) and [Predibase](https://predibase.com)!

This is a LoRA adapter for `mistralai/Mistral-7B-v0.1`, fine-tuned on GSM8K. Given a grade school math question, it generates the answer along with the reasoning steps.

Try it in [LoRAX](https://github.com/predibase/lorax):

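```python
from lorax import Client

# Replace with the URL of your running LoRAX deployment.
client = Client("http://<your_endpoint>")

question = "<your math question>"

# Same wording as the prompt template used for fine-tuning (see the Ludwig config).
prompt = f"""
Please answer the following question: {question}

Answer:
"""

# The adapter is applied on top of the base model at request time.
adapter_id = "tgaddair/mistral-7b-gsmk8k-lora-r8"
resp = client.generate(prompt, max_new_tokens=64, adapter_id=adapter_id)
print(resp.generated_text)
```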
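The prompt wording matters because the adapter was fine-tuned on one fixed template. The sketch below builds a prompt from that template and pulls a final number out of the generated text. The helper names `build_prompt` and `extract_final_answer` are hypothetical and not part of LoRAX or Ludwig. It also assumes the adapter reproduces the GSM8K convention of ending a solution with `#### <number>`, which this card does not confirm.

```python
import re
from typing import Optional

# Template text copied from the `prompt.template` field of the Ludwig config.
PROMPT_TEMPLATE = "Please answer the following question: {question}\n\nAnswer:"


def build_prompt(question: str) -> str:
    """Fill the fine-tuning template with a single question (hypothetical helper)."""
    return PROMPT_TEMPLATE.format(question=question.strip())


def extract_final_answer(generated_text: str) -> Optional[str]:
    """Return the number after a GSM8K-style '####' marker, or None if absent.

    Assumption: the adapter emits the GSM8K answer format. With
    max_new_tokens=64, long solutions may be cut off before the marker.
    """
    match = re.search(r"####\s*(-?[\d,]*\.?\d+)", generated_text)
    if match is None:
        return None
    # Drop thousands separators so "1,234" becomes "1234".
    return match.group(1).replace(",", "")


if __name__ == "__main__":
    print(build_prompt("Tom has 3 apples and buys 4 more. How many does he have?"))
    print(extract_final_answer("Tom has 3 + 4 = 7 apples.\n#### 7"))
```

If `extract_final_answer` returns `None`, the generation was probably truncated or did not follow the expected format; raising `max_new_tokens` in the request is one thing to try.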



## Model Details

### Model Description

Ludwig config (v0.9.3):

```yaml
model_type: llm
input_features:
  - name: prompt
    type: text
    preprocessing:
      max_sequence_length: null
    column: prompt
output_features:
  - name: answer
    type: text
    preprocessing:
      max_sequence_length: null
    column: answer
prompt:
  template: |-
    Please answer the following question: {question}

    Answer:
preprocessing:
  split:
    type: fixed
    column: split
  global_max_sequence_length: 2048
adapter:
  type: lora
generation:
  max_new_tokens: 64
trainer:
  type: finetune
  epochs: 3
  optimizer:
    type: paged_adam
  batch_size: 1
  eval_steps: 100
  learning_rate: 0.0002
  eval_batch_size: 2
  steps_per_checkpoint: 1000
  learning_rate_scheduler:
    decay: cosine
    warmup_fraction: 0.03
  gradient_accumulation_steps: 16
  enable_gradient_checkpointing: true
base_model: mistralai/Mistral-7B-v0.1
quantization:
  bits: 4
```
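Two of the trainer settings are easier to read as numbers. The sketch below computes the per-device effective batch size (`batch_size` times `gradient_accumulation_steps`) and evaluates a generic linear-warmup, cosine-decay learning rate curve using the configured `learning_rate` and `warmup_fraction`. The schedule is a common textbook formulation and may not match Ludwig's implementation exactly. The function names and the `total_steps` value are hypothetical, since the number of training steps depends on the dataset split.

```python
import math

# Values copied from the `trainer` section of the config.
BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 16
LEARNING_RATE = 0.0002
WARMUP_FRACTION = 0.03


def effective_batch_size(batch_size: int, accumulation_steps: int) -> int:
    """Examples contributing to each optimizer update, per device."""
    return batch_size * accumulation_steps


def lr_at(step: int, total_steps: int, base_lr: float = LEARNING_RATE,
          warmup_fraction: float = WARMUP_FRACTION) -> float:
    """Generic linear warmup followed by cosine decay to zero.

    Assumption: an illustrative schedule, not necessarily the exact
    curve Ludwig produces for `decay: cosine`.
    """
    warmup_steps = max(1, int(total_steps * warmup_fraction))
    if step < warmup_steps:
        # Ramp linearly up to base_lr over the warmup phase.
        return base_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


if __name__ == "__main__":
    print(effective_batch_size(BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    total_steps = 1000  # hypothetical step count, for illustration only
    for step in (0, 29, 500, 1000):
        print(step, lr_at(step, total_steps))
```

With these settings each optimizer update accumulates gradients from 16 examples, which keeps per-step memory at a batch size of 1 while still averaging over a larger batch.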