Instructions to use teclabs/llama-capro with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use teclabs/llama-capro with llama-cpp-python:

```python
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="teclabs/llama-capro",
    filename="llama-capro-f16.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use teclabs/llama-capro with llama.cpp:
Install from brew
```sh
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf teclabs/llama-capro:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf teclabs/llama-capro:Q4_K_M
```
Install from WinGet (Windows)
```sh
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf teclabs/llama-capro:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf teclabs/llama-capro:Q4_K_M
```
Use pre-built binary
```sh
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf teclabs/llama-capro:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf teclabs/llama-capro:Q4_K_M
```
Build from source code
```sh
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf teclabs/llama-capro:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf teclabs/llama-capro:Q4_K_M
```
Use Docker
```sh
docker model run hf.co/teclabs/llama-capro:Q4_K_M
```
- LM Studio
- Jan
- vLLM
How to use teclabs/llama-capro with vLLM:
Install from pip and serve model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "teclabs/llama-capro"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "teclabs/llama-capro",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
```
Use Docker
```sh
docker model run hf.co/teclabs/llama-capro:Q4_K_M
```
- Ollama
How to use teclabs/llama-capro with Ollama:
```sh
ollama run hf.co/teclabs/llama-capro:Q4_K_M
```
- Unsloth Studio
How to use teclabs/llama-capro with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
```sh
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for teclabs/llama-capro to start chatting
```
Install Unsloth Studio (Windows)
```powershell
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# and search for teclabs/llama-capro to start chatting
```
Using HuggingFace Spaces for Unsloth

No setup required: open https://huggingface.co/spaces/unsloth/studio in your browser and search for teclabs/llama-capro to start chatting.
- Pi
How to use teclabs/llama-capro with Pi:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf teclabs/llama-capro:Q4_K_M
```
Configure the model in Pi
```sh
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
```
Add the provider to `~/.pi/agent/models.json`:
```json
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "teclabs/llama-capro:Q4_K_M" }
      ]
    }
  }
}
```
Run Pi
```sh
# Start Pi in your project directory:
pi
```
- Hermes Agent
How to use teclabs/llama-capro with Hermes Agent:
Start the llama.cpp server
```sh
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf teclabs/llama-capro:Q4_K_M
```
Configure Hermes
```sh
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default teclabs/llama-capro:Q4_K_M
```
Run Hermes
```sh
hermes
```
- Docker Model Runner
How to use teclabs/llama-capro with Docker Model Runner:
```sh
docker model run hf.co/teclabs/llama-capro:Q4_K_M
```
- Lemonade
How to use teclabs/llama-capro with Lemonade:
Pull the model
```sh
# Download Lemonade from https://lemonade-server.ai/
lemonade pull teclabs/llama-capro:Q4_K_M
```
Run and chat with the model
```sh
lemonade run user.llama-capro-Q4_K_M
```
List all available models
```sh
lemonade list
```
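The server-based options above (llama.cpp, vLLM, Docker Model Runner) all expose OpenAI-compatible chat completions, and llama-cpp-python returns the same structure as a Python dict. A minimal sketch of pulling the assistant's reply out of such a response; the dict here is a hand-written stand-in, not real model output:

```python
# OpenAI-style chat-completion response shape; the content string
# below is a placeholder, not an actual model reply.
response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "PLACEHOLDER_REPLY"},
            "finish_reason": "stop",
        }
    ],
}

# The assistant text lives at choices[0].message.content:
reply = response["choices"][0]["message"]["content"]
print(reply)
```

The same indexing works on the JSON returned by `llama-server`, vLLM, or any other OpenAI-compatible endpoint.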
```yaml
license: llama3.1
base_model: meta-llama/Llama-3.1-8B-Instruct
tags:
  - llama
  - llama-3
  - gguf
  - quantized
  - indian-accounting
  - ind-as
  - finance
  - accounting
  - ca
language:
  - en
pipeline_tag: text-generation
```
# Llama-CAPRO (CA Professional) - Indian Accounting Standards Expert
CAPRO is a specialized AI assistant for Chartered Accountants and finance professionals, fine-tuned on Indian Accounting Standards (Ind AS).
## Model Details
- Base Model: meta-llama/Llama-3.1-8B-Instruct
- Fine-tuned Adapter: 0xadityam/llama-aica
- Domain: Indian Accounting Standards (Ind AS)
- Format: GGUF (Ollama, LM Studio, llama.cpp ready)
- Training: LoRA fine-tuned on 269 Ind AS examples
## Available Quantizations
| Format | Size | Use Case | RAM Required |
|---|---|---|---|
| F16 | ~15GB | Maximum quality | 20GB+ |
| Q5_K_M | ~5.5GB | Good quality, laptops | 8-12GB |
| Q4_K_M | ~4.5GB | Balanced, edge devices | 6-8GB |
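As a rough sanity check on the sizes in the table, a GGUF file weighs in at approximately parameter count times bits per weight divided by eight. The bits-per-weight figures below are assumed approximate effective values for each quantization type, and metadata overhead is ignored:

```python
params = 8.0e9  # Llama 3.1 8B parameter count, approximate

# Approximate effective bits per weight per format (assumed values):
formats = {"F16": 16.0, "Q5_K_M": 5.7, "Q4_K_M": 4.8}

for name, bpw in formats.items():
    gib = params * bpw / 8 / 2**30  # bytes -> GiB
    print(f"{name}: ~{gib:.1f} GiB")
```

The estimates (about 14.9, 5.3, and 4.5 GiB) line up with the table within the precision of the assumed bits-per-weight values.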
## Quick Start with Ollama

```sh
# Download Q4_K_M (recommended)
wget https://huggingface.co/teclabs/llama-capro/resolve/main/llama-capro-q4_k_m.gguf

# Create the model
ollama create llama-capro -f Modelfile-q4-k-m

# Run it
ollama run llama-capro
```
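Once `ollama run` works, the same model can be queried programmatically through Ollama's local REST server (default port 11434). A sketch of building such a request in Python; the actual POST is left commented out so the snippet runs without a server, and the model name assumes the `ollama create llama-capro` step above:

```python
import json

# Request body for Ollama's /api/chat endpoint:
payload = {
    "model": "llama-capro",
    "messages": [
        {"role": "user", "content": "What is the objective of Ind AS 1?"}
    ],
    "stream": False,
}
body = json.dumps(payload).encode("utf-8")

# With the server running, send it with urllib:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=body,
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```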
## Sample Queries
- "What is the objective of Ind AS 1?"
- "Explain revenue recognition under Ind AS 115"
- "What are disclosure requirements for financial instruments?"
- "How should goodwill be accounted for under Ind AS?"
- "What is the difference between Ind AS and IFRS?"
## Modelfile (Ollama)

```
FROM llama-capro-q4_k_m.gguf

PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER stop "<|eot_id|>"
PARAMETER num_ctx 2048

SYSTEM "You are CAPRO (CA Professional), an expert on Indian Accounting Standards (Ind AS). Provide accurate answers about Ind AS regulations and accounting policies."
```
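The `stop "<|eot_id|>"` parameter matters because Llama 3.1 terminates each chat turn with that token. A sketch of the prompt layout the Llama 3.1 chat template produces, assembled by hand here for illustration, with a shortened version of the system message above:

```python
system = ("You are CAPRO (CA Professional), an expert on Indian "
          "Accounting Standards (Ind AS).")
user = "What is the objective of Ind AS 1?"

# Llama 3.1 chat layout: each turn is wrapped in header tokens and
# terminated by <|eot_id|>, which is why the Modelfile sets it as a stop.
prompt = (
    "<|begin_of_text|>"
    "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
    "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
)

# Generation then stops when the model emits <|eot_id|> after its reply.
```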
## Use with LM Studio
- Download LM Studio from lmstudio.ai
- Click "Import" → "Import GGUF"
- Select the downloaded .gguf file
- Start chatting
## Training Details
- LoRA Rank: 64
- LoRA Alpha: 16
- Training Epochs: 3
- Learning Rate: 2e-4
- Max Sequence Length: 2048
- Training Data: 269 examples covering Ind AS 1 and related standards
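The LoRA settings above mean the base weights stay frozen while a rank-64 update, scaled by alpha/rank, is learned on top of them. A toy numpy sketch of that update (tiny dimensions for illustration; the real run used rank 64 on much larger layers):

```python
import numpy as np

d_out, d_in = 16, 16   # toy layer size; real layers are far larger
r, alpha = 4, 16       # toy rank; the training run used r=64, alpha=16

rng = np.random.default_rng(0)
W = rng.normal(size=(d_out, d_in))           # frozen base weight
A = rng.normal(scale=0.01, size=(r, d_in))   # trainable, small random init
B = np.zeros((d_out, r))                     # trainable, zero init

# Effective weight: W + (alpha / r) * B @ A.
# With B initialized to zero, training starts exactly at the base model.
W_eff = W + (alpha / r) * B @ A
```

Only A and B (2 * r * d parameters per layer instead of d * d) are updated during fine-tuning, which is what makes training on 269 examples practical.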
## Limitations
- Specialized for Indian Accounting Standards
- Not suitable for other accounting standards (IFRS, US GAAP)
- Responses should be verified with official documents
- Not for legal or investment advice
## License
Llama 3.1 Community License
## Acknowledgements
- Base model: meta-llama/Llama-3.1-8B-Instruct
- LoRA adapter: 0xadityam/llama-aica
- Conversion: llama.cpp
CAPRO - CA Professional AI Assistant
Generated: 2025-11-14