tinybiggames
/

Phi-3-mini-4k-instruct-Q4_K_M-GGUF

Text Generation

Model card Files Files and versions Community

Phi-3-mini-4k-instruct-Q4_K_M-GGUF / README.md

tinybiggames's picture

Update README.md

834331c verified about 2 months ago

|

history blame contribute delete

No virus

2.05 kB

	---
	language:
	- en
	license: mit
	tags:
	- nlp
	- code
	- llama-cpp
	- gguf-my-repo
	- LMEngine
	license_link: https://huggingface.co/microsoft/Phi-3-mini-4k-instruct/resolve/main/LICENSE
	pipeline_tag: text-generation
	inference:
	parameters:
	temperature: 0
	widget:
	- messages:
	- role: user
	content: Can you provide ways to eat combinations of bananas and dragonfruits?
	---

	# tinybiggames/Phi-3-mini-4k-instruct-Q4_K_M-GGUF
	This model was converted to GGUF format from [`microsoft/Phi-3-mini-4k-instruct`](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) for more details on the model.
	## Use with tinyBigGAMES's [Inference](https://github.com/tinyBigGAMES) Libraries.


	How to configure LMEngine:

	```Delphi
	InitConfig(
	'C:/LLM/gguf', // path to model files
	-1 // number of GPU layer, -1 to use all available layers
	);
	```

	How to define model:

	```Delphi
	DefineModel('phi-3-mini-4k-instruct.Q4_K_M.gguf',
	'phi-3-mini-4k-instruct.Q4_K_M', 4000,
	'<\|{role}\|>{content}<\|end\|>',
	'<\|assistant\|>');
	```

	How to add a message:

	```Delphi
	AddMessage(
	ROLE_USER, // role
	'What is AI?' // content
	);
	```

	`{role}` - will be substituted with the message "role"
	`{content}` - will be substituted with the message "content"

	How to do inference:

	```Delphi
	var
	LTokenOutputSpeed: Single;
	LInputTokens: Int32;
	LOutputTokens: Int32;
	LTotalTokens: Int32;

	if RunInference('phi-3-mini-4k-instruct.Q4_K_M', 1024) then
	begin
	GetInferenceStats(nil, @LTokenOutputSpeed, @LInputTokens, @LOutputTokens,
	@LTotalTokens);
	PrintLn('', FG_WHITE);
	PrintLn('Tokens :: Input: %d, Output: %d, Total: %d, Speed: %3.1f t/s',
	FG_BRIGHTYELLOW, LInputTokens, LOutputTokens, LTotalTokens, LTokenOutputSpeed);
	end
	else
	begin
	PrintLn('', FG_WHITE);
	PrintLn('Error: %s', FG_RED, GetError());
	end;
	```