from __future__ import annotations

import argparse
from typing import List, Literal, Union, Any, Type, TypeVar

from pydantic import BaseModel


def _get_base_type(annotation: Type[Any]) -> Type[Any]:
    """Unwrap Literal, Union/Optional, and List annotations to a base type."""
    if getattr(annotation, "__origin__", None) is Literal:
        # Literal["a", "b"] -> type of the first literal value (here: str).
        assert hasattr(annotation, "__args__") and len(annotation.__args__) >= 1  # type: ignore
        return type(annotation.__args__[0])  # type: ignore
    elif getattr(annotation, "__origin__", None) is Union:
        # Optional[X] / Union[X, ...] -> base type of the first non-None member.
        assert hasattr(annotation, "__args__") and len(annotation.__args__) >= 1  # type: ignore
        non_optional_args: List[Type[Any]] = [
            arg for arg in annotation.__args__ if arg is not type(None)  # type: ignore
        ]
        if non_optional_args:
            return _get_base_type(non_optional_args[0])
    elif (
        getattr(annotation, "__origin__", None) is list
        or getattr(annotation, "__origin__", None) is List
    ):
        # List[X] -> base type of the element type X.
        assert hasattr(annotation, "__args__") and len(annotation.__args__) >= 1  # type: ignore
        return _get_base_type(annotation.__args__[0])  # type: ignore
    return annotation


def _contains_list_type(annotation: Type[Any] | None) -> bool:
    """Return True if the annotation is a list or a Literal/Union containing one."""
    origin = getattr(annotation, "__origin__", None)
    if origin is list or origin is List:
        return True
    elif origin in (Literal, Union):
        return any(_contains_list_type(arg) for arg in annotation.__args__)  # type: ignore
    else:
        return False
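The two helpers above unwrap nested annotations by reading `__origin__` and `__args__` directly. As a rough illustration of what they compute, the sketch below restates the same idea with the standard `typing.get_origin` and `typing.get_args` helpers. The names `base_type` and `contains_list` are hypothetical and are not part of this file; the sketch also skips the Literal branch of the list check, since literal values never carry a list origin.

```python
from typing import List, Literal, Optional, Union, get_args, get_origin


def base_type(annotation):
    """Hypothetical sketch: reduce an annotation to the type argparse should use."""
    origin = get_origin(annotation)
    if origin is Literal:
        # Literal["a", "b"] -> type of the first literal value.
        return type(get_args(annotation)[0])
    if origin is Union:
        # Optional[X] is Union[X, None]; use the first non-None member.
        non_none = [a for a in get_args(annotation) if a is not type(None)]
        if non_none:
            return base_type(non_none[0])
    if origin is list:
        # List[X] -> recurse into the element type.
        return base_type(get_args(annotation)[0])
    return annotation


def contains_list(annotation):
    """Hypothetical sketch: does the annotation hold a list anywhere inside a Union?"""
    origin = get_origin(annotation)
    if origin is list:
        return True
    if origin is Union:
        return any(contains_list(a) for a in get_args(annotation))
    return False


print(base_type(Optional[List[int]]))
print(base_type(Literal["a", "b"]))
print(contains_list(Optional[List[str]]))
```

An annotation such as `Optional[List[int]]` therefore maps to the element type `int` plus a "this is a list" flag, which is the pair of facts `add_args_from_model` needs to pick `type` and `nargs`.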


def _parse_bool_arg(arg: str | bytes | bool) -> bool:
    """Parse a command-line value such as "yes", "0" or "off" into a bool."""
    if isinstance(arg, bytes):
        arg = arg.decode("utf-8")

    true_values = {"1", "on", "t", "true", "y", "yes"}
    false_values = {"0", "off", "f", "false", "n", "no"}

    # Comparison is case-insensitive and ignores surrounding whitespace.
    arg_str = str(arg).lower().strip()

    if arg_str in true_values:
        return True
    elif arg_str in false_values:
        return False
    else:
        raise ValueError(f"Invalid boolean argument: {arg}")
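A dedicated boolean parser is needed because `type=bool` in argparse calls `bool()` on the raw string, and any non-empty string (including `"false"`) is truthy. The sketch below shows the pattern in isolation; `parse_bool` and the `--verbose` flag are hypothetical stand-ins chosen for illustration, not names from this file.

```python
import argparse


def parse_bool(text: str) -> bool:
    """Hypothetical stand-in for the boolean parser shown above."""
    value = text.lower().strip()
    if value in {"1", "on", "t", "true", "y", "yes"}:
        return True
    if value in {"0", "off", "f", "false", "n", "no"}:
        return False
    # argparse turns a ValueError raised here into a usage error.
    raise ValueError(f"Invalid boolean argument: {text}")


parser = argparse.ArgumentParser()
parser.add_argument("--verbose", type=parse_bool)  # illustrative flag name

args = parser.parse_args(["--verbose", "Yes"])
print(args.verbose)  # prints True
```

When the flag is omitted, the value stays `None`, which lets a caller tell "not given" apart from an explicit `false`.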


def add_args_from_model(parser: argparse.ArgumentParser, model: Type[BaseModel]):
    """Add arguments from a pydantic model to an argparse parser."""
    for name, field in model.model_fields.items():
        description = field.description
        # Append the default to the help text for optional fields that have one.
        if field.default and description and not field.is_required():
            description += f" (default: {field.default})"
        base_type = (
            _get_base_type(field.annotation) if field.annotation is not None else str
        )
        list_type = _contains_list_type(field.annotation)
        if base_type is not bool:
            parser.add_argument(
                f"--{name}",
                dest=name,
                nargs="*" if list_type else None,
                type=base_type,
                help=description,
            )
        else:
            # Booleans take an explicit value (e.g. --flag true), parsed leniently.
            parser.add_argument(
                f"--{name}",
                dest=name,
                type=_parse_bool_arg,
                help=f"{description}",
            )


# Bound to BaseModel so the function takes a model class and returns an instance.
T = TypeVar("T", bound=BaseModel)


def parse_model_from_args(model: Type[T], args: argparse.Namespace) -> T:
    """Parse a pydantic model from an argparse namespace."""
    # Skip unset (None) arguments so the model's own defaults apply.
    return model(
        **{
            k: v
            for k, v in vars(args).items()
            if v is not None and k in model.model_fields
        }
    )