doberst committed
Commit 79f9165
Parent(s): b189f49

Update README.md

Files changed (1)
  1. README.md +3 -2
README.md CHANGED
@@ -7,10 +7,11 @@ license: cc-by-sa-4.0

  <!-- Provide a quick summary of what the model is/does. -->

- **slim-summary-tool** is a 4_K_M quantized GGUF version of slim-sa-ner-3b, providing a small, fast inference implementation, optimized for multi-model concurrent deployment.
+ **slim-summary-tool** is a 4_K_M quantized GGUF version of slim-summary, providing a small, fast inference implementation, optimized for multi-model concurrent deployment.

- The size of the self-contained GGUF model binary is 1.71 GB, which is small enough to run locally on a CPU, and yet which comparables favorably with the use of two traditional FP32 versions of Roberta-Large for NER (1.42GB) and BERT for Sentiment Analysis (440 MB), while offering greater potential capacity depth with 2.7B parameters, and without the requirement of Pytorch and other external dependencies.
+ The size of the self-contained GGUF model binary is 1.71 GB, which is small enough to run locally on a CPU with reasonable inference speed.

+ The model takes as input a text passage, an optional parameter with a focusing phrase or query, and an experimental optional (N) parameter, which is used to guide the model to return a specific number of items in the summary list.

  [**slim-summary**](https://huggingface.co/llmware/slim-summary) is part of the SLIM ("**S**tructured **L**anguage **I**nstruction **M**odel") series, providing a set of small, specialized decoder-based LLMs, fine-tuned for function-calling.
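The added paragraph describes the model's inputs: a text passage, an optional focusing phrase or query, and an experimental (N) count. A minimal usage sketch is below, assuming the llmware ModelCatalog / function_call pattern used by other SLIM tool models; the `params` keyword and the "key points (3)" parameter string are assumptions based on that pattern, not part of this commit.

```python
# Minimal sketch (assumed llmware interface): load the quantized GGUF tool
# and run a function call with a focusing phrase plus an experimental (N) count.
from llmware.models import ModelCatalog

# Load the GGUF tool version from the llmware model catalog.
model = ModelCatalog().load_model("slim-summary-tool")

text_passage = (
    "The company reported revenue of $4.2 billion for the quarter, up 8% "
    "year-over-year, and announced a new $1 billion share buyback program."
)

# "key points (3)" combines the optional focusing phrase with the experimental
# (N) parameter, asking for roughly three items in the returned summary list.
# The 'params' keyword and parameter string format are assumptions.
response = model.function_call(text_passage, params=["key points (3)"])

print(response)
```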