Instructions to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf",
	filename="llama_1b_step2_batch_v2.IQ3_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with llama.cpp:

Install (macOS, Linux)

curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
# Run inference directly in the terminal:
llama cli -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Use Docker

docker model run hf.co/RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

LM Studio
Jan
Ollama
How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Ollama:
```
ollama run hf.co/RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
```

Unsloth Studio

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf to start chatting

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama serve -hf RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Run Hermes

hermes

Atomic Chat new
Docker Model Runner
How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Docker Model Runner:
```
docker model run hf.co/RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M
```

Lemonade

How to use RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull RichardErkhov/danielgombas_-_llama_1b_step2_batch_v2-gguf:Q4_K_M

Run and chat with the model

lemonade run user.danielgombas_-_llama_1b_step2_batch_v2-gguf-Q4_K_M

List all available models

lemonade list

YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

Quantization made by Richard Erkhov.

Github

Discord

Request more models

llama_1b_step2_batch_v2 - GGUF

Model creator: https://huggingface.co/danielgombas/
Original model: https://huggingface.co/danielgombas/llama_1b_step2_batch_v2/

Name	Quant method	Size
llama_1b_step2_batch_v2.Q2_K.gguf	Q2_K	0.54GB
llama_1b_step2_batch_v2.IQ3_XS.gguf	IQ3_XS	0.58GB
llama_1b_step2_batch_v2.IQ3_S.gguf	IQ3_S	0.6GB
llama_1b_step2_batch_v2.Q3_K_S.gguf	Q3_K_S	0.6GB
llama_1b_step2_batch_v2.IQ3_M.gguf	IQ3_M	0.61GB
llama_1b_step2_batch_v2.Q3_K.gguf	Q3_K	0.64GB
llama_1b_step2_batch_v2.Q3_K_M.gguf	Q3_K_M	0.64GB
llama_1b_step2_batch_v2.Q3_K_L.gguf	Q3_K_L	0.68GB
llama_1b_step2_batch_v2.IQ4_XS.gguf	IQ4_XS	0.7GB
llama_1b_step2_batch_v2.Q4_0.gguf	Q4_0	0.72GB
llama_1b_step2_batch_v2.IQ4_NL.gguf	IQ4_NL	0.72GB
llama_1b_step2_batch_v2.Q4_K_S.gguf	Q4_K_S	0.72GB
llama_1b_step2_batch_v2.Q4_K.gguf	Q4_K	0.75GB
llama_1b_step2_batch_v2.Q4_K_M.gguf	Q4_K_M	0.75GB
llama_1b_step2_batch_v2.Q4_1.gguf	Q4_1	0.77GB
llama_1b_step2_batch_v2.Q5_0.gguf	Q5_0	0.83GB
llama_1b_step2_batch_v2.Q5_K_S.gguf	Q5_K_S	0.83GB
llama_1b_step2_batch_v2.Q5_K.gguf	Q5_K	0.85GB
llama_1b_step2_batch_v2.Q5_K_M.gguf	Q5_K_M	0.85GB
llama_1b_step2_batch_v2.Q5_1.gguf	Q5_1	0.89GB
llama_1b_step2_batch_v2.Q6_K.gguf	Q6_K	0.95GB
llama_1b_step2_batch_v2.Q8_0.gguf	Q8_0	1.23GB

Original model description:

library_name: transformers tags: - trl - sft - generated_from_trainer model-index: - name: llama_1b_step2_batch_v2 results: []

llama_1b_step2_batch_v2

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.3338

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 2
eval_batch_size: 40
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 8
optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 2

Training results

Training Loss	Epoch	Step	Validation Loss
0.9373	0.0341	50	0.8757
0.6838	0.0682	100	0.7815
0.7189	0.1023	150	0.7197
0.5827	0.1364	200	0.6686
0.5084	0.1704	250	0.6180
0.5357	0.2045	300	0.5858
0.4738	0.2386	350	0.5618
0.5091	0.2727	400	0.5337
0.3793	0.3068	450	0.5149
0.5388	0.3409	500	0.4985
0.4726	0.3750	550	0.4801
0.5348	0.4091	600	0.4632
0.4644	0.4432	650	0.4477
0.4033	0.4772	700	0.4367
0.4283	0.5113	750	0.4309
0.5275	0.5454	800	0.4201
0.4633	0.5795	850	0.4115
0.3312	0.6136	900	0.4038
0.4768	0.6477	950	0.3969
0.4401	0.6818	1000	0.3924
0.3125	0.7159	1050	0.3882
0.3651	0.7500	1100	0.3820
0.354	0.7840	1150	0.3770
0.3525	0.8181	1200	0.3701
0.4069	0.8522	1250	0.3659
0.2806	0.8863	1300	0.3613
0.3593	0.9204	1350	0.3584
0.3393	0.9545	1400	0.3540
0.3114	0.9886	1450	0.3504
0.2571	1.0228	1500	0.3556
0.2991	1.0569	1550	0.3531
0.2445	1.0910	1600	0.3512
0.2865	1.1251	1650	0.3520
0.2146	1.1592	1700	0.3492
0.2469	1.1933	1750	0.3481
0.2927	1.2274	1800	0.3474
0.2797	1.2615	1850	0.3454
0.247	1.2956	1900	0.3455
0.2208	1.3296	1950	0.3433
0.2396	1.3637	2000	0.3420
0.252	1.3978	2050	0.3407
0.2296	1.4319	2100	0.3387
0.2349	1.4660	2150	0.3391
0.2408	1.5001	2200	0.3374
0.236	1.5342	2250	0.3376
0.1969	1.5683	2300	0.3375
0.2513	1.6024	2350	0.3368
0.2619	1.6364	2400	0.3360
0.3016	1.6705	2450	0.3351
0.2345	1.7046	2500	0.3352
0.2474	1.7387	2550	0.3347
0.2475	1.7728	2600	0.3343
0.2627	1.8069	2650	0.3342
0.2381	1.8410	2700	0.3340
0.2984	1.8751	2750	0.3338
0.2434	1.9092	2800	0.3338
0.2608	1.9432	2850	0.3338
0.2526	1.9773	2900	0.3338

Framework versions

Transformers 4.46.0
Pytorch 2.1.0+cu118
Datasets 3.0.2
Tokenizers 0.20.1

Downloads last month: 23

GGUF

Model size

1B params

Architecture

llama

Hardware compatibility

2-bit

3-bit

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support