---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- quantization
- LLM
- Dolly
---

**Requirements:**

You can run this model on Google Colab Pro; it requires a substantial amount of GPU VRAM.
<pre>
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
</pre>
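Before loading the model, you can check how much VRAM is free on the runtime (a minimal sketch, assuming a CUDA-enabled Colab GPU runtime with PyTorch already installed):
<pre>
import torch

# Report free and total memory on the current GPU, in GiB
free, total = torch.cuda.mem_get_info()
print(f"GPU VRAM: {free / 1024**3:.1f} GiB free of {total / 1024**3:.1f} GiB total")
</pre>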
**Load the model using:**
<pre>
<code>
import torch
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer

peft_model_id = "AhmedBou/databricks-dolly-v2-3b_on_NCSS"
config = PeftConfig.from_pretrained(peft_model_id)

# Load the base model in 8-bit and spread it automatically across available devices
model = AutoModelForCausalLM.from_pretrained(
    config.base_model_name_or_path,
    return_dict=True,
    load_in_8bit=True,
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, peft_model_id)
</code>
</pre>
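Optionally, you can switch the model to evaluation mode and confirm the adapter is attached (a small sanity check, using the objects created above):
<pre>
<code>
# Disable dropout for inference and print the attached adapter configuration
model.eval()
print(model.peft_config)
</code>
</pre>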
**Run inference using:**
<pre>
<code>
# Tokenize the prompt and generate up to 50 new tokens
batch = tokenizer("Multiple Regression for Appraisal -->: ", return_tensors='pt')

with torch.cuda.amp.autocast():
    output_tokens = model.generate(**batch, max_new_tokens=50)

print("\n\n", tokenizer.decode(output_tokens[0], skip_special_tokens=True))
</code>
</pre>
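For repeated prompts, it can be convenient to wrap generation in a small helper; the function name and sampling settings below are illustrative, not part of the original card:
<pre>
<code>
def generate_text(prompt, max_new_tokens=50):
    # Hypothetical helper: tokenize, generate with sampling, and decode
    batch = tokenizer(prompt, return_tensors='pt')
    with torch.cuda.amp.autocast():
        output_tokens = model.generate(
            **batch,
            max_new_tokens=max_new_tokens,
            do_sample=True,    # sample instead of greedy decoding
            temperature=0.7,   # illustrative value
            top_p=0.9,         # illustrative value
        )
    return tokenizer.decode(output_tokens[0], skip_special_tokens=True)

print(generate_text("Multiple Regression for Appraisal -->: "))
</code>
</pre>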
**Output:**
<pre>
<code>
“Multiple Regression for Appraisal” -->: Multiple Regression for Appraisal (MRA) -->: Multiple Regression for Appraisal (MRA) (with Covariates) -->: Multiple Regression for Appraisal (MRA) (with Covariates)
</code>
</pre>