simon-fcyt-umss-v2

AI model fine-tuned for the Facultad de Ciencias y Tecnología (Faculty of Science and Technology) of the Universidad Mayor de San Simón (UMSS), designed for the TecnoTime app. It helps students stay organized, motivated, and connected to their academic activities through short, consistent, parseable responses.

This model is a fine-tune of LiquidAI/LFM2.5-350M, trained with SFT + LoRA using Unsloth. It is geared toward producing output compatible with the SimonResponse structure used by the Android system.

Target repository:

Smith-3/simon-fcyt-umss-v2

Academic Purpose

The model is geared toward:

Class and schedule reminders.
Motivational messages that help students keep up academic continuity.
Short check-ins that track daily progress.
Positive encouragement at key moments of the semester.

It is meant to reinforce habits of consistency, attendance, mood awareness, and academic self-care.

Structured Response Format

The expected output is not free text. The model must generate a single valid JSON object, with no Markdown around it, whose fields are compatible with TecnoTime.

Base fields:

{
  "why": "...",
  "opening_tag": "...",
  "icon_tag": "..."
}

Fields by notification type:

Notification type	Requires	Must not include
`MOTIVATIONAL_CHECK_IN`	`question`, `choices` with 2 options	-
`CLASS_REMINDER`	`question`, `choices` with 2 options	-
`CHECK_IN_CLOSURE`	`message` or `reminder_copy`	`choices`
`REMINDER_CLOSURE`	`reminder_copy`	`choices`
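
The rules in the table can be checked mechanically before a response reaches the app. The following is a minimal sketch, not TecnoTime's actual validator: the function name `validate_response` and the `RULES` layout are hypothetical, and only the field names and per-type constraints come from this card.

```python
import json

# Base fields listed in this card; every response is expected to carry them.
BASE_FIELDS = ("why", "opening_tag", "icon_tag")

# Per-type rules transcribed from the table above (layout is hypothetical).
RULES = {
    "MOTIVATIONAL_CHECK_IN": {
        "required": ("question", "choices"), "any_of": (),
        "forbidden": (), "n_choices": 2,
    },
    "CLASS_REMINDER": {
        "required": ("question", "choices"), "any_of": (),
        "forbidden": (), "n_choices": 2,
    },
    "CHECK_IN_CLOSURE": {
        "required": (), "any_of": ("message", "reminder_copy"),
        "forbidden": ("choices",), "n_choices": None,
    },
    "REMINDER_CLOSURE": {
        "required": ("reminder_copy",), "any_of": (),
        "forbidden": ("choices",), "n_choices": None,
    },
}


def validate_response(raw: str, notification_type: str) -> list[str]:
    """Return a list of problems; an empty list means the payload passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    rule = RULES[notification_type]
    problems = []
    for field in BASE_FIELDS + rule["required"]:
        if field not in data:
            problems.append(f"missing field: {field}")
    if rule["any_of"] and not any(f in data for f in rule["any_of"]):
        problems.append("needs one of: " + ", ".join(rule["any_of"]))
    for field in rule["forbidden"]:
        if field in data:
            problems.append(f"unexpected field: {field}")
    if rule["n_choices"] is not None and "choices" in data:
        choices = data["choices"]
        if not isinstance(choices, list) or len(choices) != rule["n_choices"]:
            problems.append(f"choices must be a list of {rule['n_choices']} items")
    return problems
```

Returning a list of problems instead of raising makes it easy to log every violation of a rejected response in one pass.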

Real Examples Used in TecnoTime

Class Reminder

{
  "why": "Para que no pierdas tu ritmo académico",
  "opening_tag": "¡Atención, estudiante FCyT!",
  "question": "¿Listo para tu clase de hoy?",
  "icon_tag": "info",
  "choices": [
    {"id": "yes", "label": "Llego en un momento"},
    {"id": "no", "label": "Estoy terminando algo"}
  ]
}

Motivational Check-in

{
  "why": "Para reforzar tu constancia",
  "opening_tag": "¡Vamos con toda, ingeniería!",
  "question": "¿Cómo te sientes con lo que has avanzado hoy?",
  "icon_tag": "focus",
  "choices": [
    {"id": "yes", "label": "Bien, sigo avanzando"},
    {"id": "no", "label": "Hoy fue complicado"}
  ]
}
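
A small model may occasionally wrap its JSON in extra text or a code fence even though the card asks for a bare object. As a defensive sketch (the helper name `extract_first_json_object` is hypothetical and not part of TecnoTime), the first parseable object can be recovered with the standard library's `json.JSONDecoder.raw_decode`:

```python
import json


def extract_first_json_object(text: str):
    """Return the first JSON object found in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores whatever follows it.
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, dict):
            return value
        # Not a valid object here; try the next opening brace.
        start = text.find("{", start + 1)
    return None
```

The returned dictionary can then be handed to whatever schema check the consuming code applies.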

Base Model and Training

Base model: LiquidAI/LFM2.5-350M
Tokenizer used in the notebook: unsloth/LFM2.5-1.2B-Instruct
Method: SFT + LoRA
Framework: Unsloth + TRL + PEFT
Dataset: data/processed/train_data.jsonl
Dataset format: LiquidAI messages
Training configuration (v2):

Parameter	Value
Epochs	3
Per-device batch size	4
Gradient accumulation steps	4
Effective batch size	16
Learning rate	`2e-4`
Scheduler	`cosine`
Warmup ratio	`0.03`
LoRA rank	`16`
LoRA alpha	`32`
LoRA dropout	`0.05`
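
To make the arithmetic behind the table explicit, the sketch below stores the values under the argument names used by `transformers.TrainingArguments` and `peft.LoraConfig` and derives the effective batch size and the LoRA scaling factor. The single-device default is an assumption inferred from the reported effective batch of 16; the training notebook itself is not reproduced here.

```python
# Values copied from the table above; key names follow the argument names
# of transformers.TrainingArguments and peft.LoraConfig.
training_args = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,
    "learning_rate": 2e-4,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
}
lora_args = {"r": 16, "lora_alpha": 32, "lora_dropout": 0.05}


def effective_batch_size(args: dict, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step (num_devices is assumed)."""
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


def lora_scaling(args: dict) -> float:
    """Standard LoRA scaling alpha / r applied to the low-rank update."""
    return args["lora_alpha"] / args["r"]


print(effective_batch_size(training_args))  # 16
print(lora_scaling(lora_args))              # 2.0
```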

Expected Files

This repository is set up to include:

File or variant	Recommended use
LoRA adapters	Reproducible backup of the fine-tuning
Merged 16-bit model	Main version for deployment or conversion
`F16` GGUF	Largest-size reference
`Q8_0` GGUF	High quality, requires more RAM
`Q6_K` GGUF	Good balance with high quality
`Q5_K_M` GGUF	Best quality/performance balance
`Q4_K_M` GGUF	Low-resource machines, labs, or modest laptops
`Q3_K_M` GGUF	Lightest variant

Usage with llama.cpp

Example with a GGUF quantization:

llama-cli -m simon-fcyt-umss-v2-Q5_K_M.gguf -p "GENERAR: CLASS_REMINDER para clase de Física"
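
The same GGUF file can also be driven from Python. This is a hedged sketch using `llama-cpp-python`: the `GENERAR: <TYPE> para <context>` pattern is inferred from the two example prompts in this card, and the helper names, `n_ctx=2048`, and `temperature=0.2` are illustrative assumptions rather than documented settings.

```python
def build_prompt(notification_type: str, context: str) -> str:
    """Build a prompt in the 'GENERAR: <TYPE> para <context>' style (inferred)."""
    return f"GENERAR: {notification_type} para {context}"


def generate(model_path: str, notification_type: str, context: str) -> str:
    """Run one chat completion against a local GGUF file and return the raw text."""
    # Imported lazily so build_prompt works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_ctx=2048, verbose=False)  # n_ctx is assumed
    result = llm.create_chat_completion(
        messages=[
            {"role": "user", "content": build_prompt(notification_type, context)}
        ],
        max_tokens=180,
        temperature=0.2,  # assumed; a low value favours stable JSON output
    )
    return result["choices"][0]["message"]["content"]


print(build_prompt("CLASS_REMINDER", "clase de Física"))
```

Calling `generate("simon-fcyt-umss-v2-Q5_K_M.gguf", "CLASS_REMINDER", "clase de Física")` would then return the model's raw reply, which still needs to be parsed as JSON.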

Usage with Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Smith-3/simon-fcyt-umss-v2"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype="auto")

messages = [
    {
        "role": "user",
        "content": "GENERAR: MOTIVATIONAL_CHECK_IN para estudiante de Ingeniería de Sistemas"
    }
]

# Apply the chat template and tokenize in one step; this avoids adding
# special tokens a second time when re-tokenizing the rendered prompt.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=180)

# Decode only the newly generated tokens so the prompt is not echoed
# in front of the JSON object.
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Intended Use

Intended use:

Integration into TecnoTime.
Generation of JSON responses for academic notifications.
Brief motivational support for students.
Local or low-cost inference with GGUF.

Out of scope:

Official academic information without external validation.
Psychological, medical, or legal counseling.
Automated administrative or academic decisions.

License

This model is derived from LiquidAI/LFM2.5-350M. Review the base model's license on Hugging Face before redistributing it or using it commercially:

https://huggingface.co/LiquidAI/LFM2.5-350M

Status

LoRA adapters v2: ready for publication.
Merged model: must be generated from the notebook before uploading.
Full GGUF set: must be generated from the notebook before uploading.
Next stage: publication and final integration into TecnoTime.

Framework versions

PEFT 0.17.1

Safetensors
Model size: 0.4B params
Tensor type: BF16

Model tree for Smith-3/simon-fcyt-umss-v2
Base model: LiquidAI/LFM2.5-350M-Base
Finetuned: LiquidAI/LFM2.5-350M
Adapter (11): this model